On 2020/6/15 8:50 PM, Greed Rong wrote:
> Does that mean about 2^20 subvolumes can be created in one root btrfs?

Unfortunately that 1<<20 limit is shared across a lot of filesystems,
like overlayfs, ceph and btrfs.

Furthermore the pool is a global pool, which means it's shared by all
btrfs filesystems.

So in one btrfs, it's way smaller than 1<<20.

>
> The snapshot delete service was stopped a few weeks ago. I think this
> is the reason why the id pool is exhausted.
> I will try to run it again and see if it works.

At least we're working on working around the limit, by:

- Reducing known unnecessary users of the pool
  Reloc trees/data reloc trees don't need to utilize the pool.

- Preallocating the id to prevent transaction abort
  So the user would get an error from the ioctl, rather than the whole
  fs being forced RO later.

Thanks,
Qu

>
> Thanks
>
> On Sat, Jun 13, 2020 at 1:13 AM David Sterba wrote:
>>
>> On Fri, Jun 12, 2020 at 11:15:43AM +0800, Greed Rong wrote:
>>> This server is used for network storage. When a new client arrives, I
>>> create a snapshot of the workspace subvolume for this client. And
>>> delete it when the client disconnects.
>>
>> NFS, cephfs and overlayfs use the same pool of ids; in combination with
>> btrfs snapshots the consumption might be higher than in other setups.
>>
>>> Most workspaces are PC game programs. It contains thousands of files
>>> and its size ranges from 1GB to 20GB.
>>
>> We can rule out regular files, they don't affect that, and the numbers
>> you posted are all normal.
>>
>>> About 200 Windows clients access this server through samba. About 20
>>> snapshots create/delete in one minute.
>>
>> This is contributing to the overall consumption of the ids from the
>> pool, but now it's shared among the network filesystems and btrfs.
>>
>> A possible explanation would be a leak of the ids; once this state is
>> hit it's permanent, so no new snapshots could be created or the network
>> clients will start getting some other error.
>>
>> If there's no leak, then all objects that have the id attached would
>> need to be active, i.e. snapshot part of a path, network client
>> connected to its path. This also means some sort of caching, so the ids
>> are not returned back right away.
>>
>> For the subvolumes the ids get returned once the subvolume is deleted
>> and cleaned, which might take time and contribute to the pool
>> exhaustion. I need to do some tests to see if we could release the ids
>> earlier.
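
[Editor's note] For readers following the thread: the pool being discussed is the kernel's
anonymous block device number pool (minor numbers under major 0, of which there are
1<<20, i.e. MINORBITS worth), handed out system-wide to every btrfs subvolume, overlayfs
mount, etc. The sketch below is a small, self-contained userspace illustration of that
behavior, not kernel code; the names get_anon_id/put_anon_id and the bitmap are
illustrative stand-ins for the kernel's IDA-based allocator.

#include <stdio.h>

#define MINORBITS	20
#define POOL_SIZE	(1 << MINORBITS)	/* ~1M ids, shared system-wide */

static unsigned char used[POOL_SIZE];		/* crude stand-in for the kernel's IDA */
static int scan_from = 1;			/* resume hint; ids below it are all in use */

/* Allocate the lowest free id >= 1 (id 0 is never handed out). */
static int get_anon_id(void)
{
	for (int i = scan_from; i < POOL_SIZE; i++) {
		if (!used[i]) {
			used[i] = 1;
			scan_from = i + 1;
			return i;
		}
	}
	return -1;	/* pool exhausted; the kernel reports this as -EMFILE */
}

/* Return an id to the pool, e.g. after a subvolume is deleted and cleaned. */
static void put_anon_id(int id)
{
	if (id <= 0 || id >= POOL_SIZE)
		return;
	used[id] = 0;
	if (id < scan_from)
		scan_from = id;	/* keep handing out the lowest free id, IDA-style */
}

int main(void)
{
	long n = 0;

	/* Simulate snapshot creation with no deletion until the pool runs dry. */
	while (get_anon_id() >= 0)
		n++;
	printf("allocated %ld ids before exhaustion (pool size 1<<%d)\n",
	       n, MINORBITS);

	/* Returning a single id makes the very next allocation succeed again. */
	put_anon_id(42);
	printf("after returning one id, next allocation: %d\n", get_anon_id());
	return 0;
}

Compiled with a plain `cc`, it prints that 1048575 ids are handed out before allocation
starts failing, and that giving back even one id (as deleting and cleaning a subvolume
eventually does) lets the next allocation succeed, which matches the exhaustion and
recovery behavior described in the thread above.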