Linux-BTRFS Archive on lore.kernel.org
From: halfdog <me@halfdog.net>
To: linux-btrfs@vger.kernel.org
Subject: FIDEDUPERANGE woes
Date: Thu, 12 Dec 2019 16:15:49 +0000
Message-ID: <2019-1576167349.500456@svIo.N5dq.dFFD> (raw)

[-- Attachment #1: Type: text/plain, Size: 2373 bytes --]

Hello list,

Using btrfs on

Linux version 5.3.0-2-amd64 (debian-kernel@lists.debian.org) (gcc version 9.2.1 20191109 (Debian 9.2.1-19)) #1 SMP Debian 5.3.9-3 (2019-11-19)

the FIDEDUPERANGE ioctl exposes weird behaviour on two identical
but not too large files that seems to depend on the file size.
Before the FIDEDUPERANGE call both files have a single extent;
afterwards the first file still has a single extent, while the
second file shares all bytes with the first file's extent except
for the last 4096 bytes.
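
For reference, the ioctl can also be driven directly from Python
instead of going through btrfs-extent-same. The following minimal
sketch reflects my reading of the struct file_dedupe_range layout
in the uapi headers, so treat the request number and field offsets
as assumptions:

    import fcntl
    import struct

    # FIDEDUPERANGE = _IOWR(0x94, 54, struct file_dedupe_range)
    FIDEDUPERANGE = 0xc0189436

    def dedupeRange(srcFd, srcOffset, length, destFd, destOffset):
        # struct file_dedupe_range header: src_offset, src_length,
        # dest_count, reserved1, reserved2 (24 bytes) ...
        buf = bytearray(struct.pack('=QQHHI', srcOffset, length, 1, 0, 0))
        # ... followed by one struct file_dedupe_range_info: dest_fd,
        # dest_offset, bytes_deduped (out), status (out), reserved.
        buf += struct.pack('=qQQiI', destFd, destOffset, 0, 0, 0)
        fcntl.ioctl(srcFd, FIDEDUPERANGE, buf)
        bytesDeduped = struct.unpack_from('=Q', buf, 40)[0]
        status = struct.unpack_from('=i', buf, 48)[0]
        return bytesDeduped, status

A call like dedupeRange(srcFd, 0, 155489, destFd, 0) would
correspond to the btrfs-extent-same invocation in the reproducer
below.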

Is there a known bug in this area that was fixed after the above
mentioned kernel version?



If not, does the following reproducer still show the same behaviour
on a current Linux kernel (my Python test tools are also attached)?

> dd if=/dev/zero bs=1M count=32 of=disk
> mkfs.btrfs --mixed --metadata single --data single --nodesize 4096 disk
> mount disk /mnt/test
> mkdir /mnt/test/x
> dd bs=1 count=155489 if=/dev/urandom of=/mnt/test/x/file-0
> cat /mnt/test/x/file-0 > /mnt/test/x/file-1

> ./SimpleIndexer x > x.json
> ./IndexDeduplicationAnalyzer --IndexFile /mnt/test/x.json /mnt/test/x > dedup.list
Got dict: {b'/mnt/test/x/file-0': [(0, 5316608, 155648)], b'/mnt/test/x/file-1': [(0, 5472256, 155648)]}
...
> strace -s256 -f btrfs-extent-same 155489 /mnt/test/x/file-0 0 /mnt/test/x/file-1 0 2>&1 | grep -E -e FIDEDUPERANGE
ioctl(3, BTRFS_IOC_FILE_EXTENT_SAME or FIDEDUPERANGE, {src_offset=0, src_length=155489, dest_count=1, info=[{dest_fd=4, dest_offset=0}]} => {info=[{bytes_deduped=155489, status=0}]}) = 0
> ./IndexDeduplicationAnalyzer --IndexFile /mnt/test/x.json /mnt/test/x > dedup.list
Got dict: {b'/mnt/test/x/file-0': [(0, 5316608, 155648)], b'/mnt/test/x/file-1': [(0, 5316608, 151552), (151552, 5623808, 4096)]}
...
> strace -s256 -f btrfs-extent-same 155489 /mnt/test/x/file-0 0 /mnt/test/x/file-1 0 2>&1 | grep -E -e FIDEDUPERANGE
ioctl(3, BTRFS_IOC_FILE_EXTENT_SAME or FIDEDUPERANGE, {src_offset=0, src_length=155489, dest_count=1, info=[{dest_fd=4, dest_offset=0}]} => {info=[{bytes_deduped=155489, status=0}]}) = 0
> strace -s256 -f btrfs-extent-same 4096 /mnt/test/x/file-0 151552 /mnt/test/x/file-1 151552 2>&1 | grep -E -e FIDEDUPERANGE
ioctl(3, BTRFS_IOC_FILE_EXTENT_SAME or FIDEDUPERANGE, {src_offset=151552, src_length=4096, dest_count=1, info=[{dest_fd=4, dest_offset=151552}]}) = -1 EINVAL (Invalid argument)
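
For reference, the offsets above are consistent with rounding down
to the 4096-byte block size (my arithmetic, not a verified
explanation):

    >>> divmod(155489, 4096)
    (37, 3937)
    >>> 37 * 4096
    151552
    >>> 151552 + 4096
    155648

So the 37 full blocks (151552 bytes) end up shared while the
3937-byte tail does not, and the follow-up request for a full
4096-byte block at offset 151552 would end at byte 155648, past
the 155489-byte end of file, which would match the EINVAL.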

[-- Attachment #2: SimpleIndexer --]
[-- Type: text/plain, Size: 7566 bytes --]

#!/usr/bin/python3 -BbbEIsSttW all

"""This module implements a simple file system indexer creating
a JSON intermediate format to be then imported to the standard
backup tools chain. This allows indexing of file systems without
having the full Java based backup stack installed.

This software is provided by the copyright owner "as is" and
any expressed or implied warranties, including, but not limited
to, the implied warranties of merchantability and fitness for
a particular purpose, are disclaimed. In no event shall the
copyright owner be liable for any direct, indirect, incidental,
special, exemplary or consequential damages, including, but not
limited to, procurement of substitute goods or services, loss
of use, data or profits or business interruption, however caused
and on any theory of liability, whether in contract, strict liability,
or tort, including negligence or otherwise, arising in any way
out of the use of this software, even if advised of the possibility
of such damage.

Copyright (c) 2019 halfdog <me (%) halfdog.net>

You may not distribute copies, re-license, sell or make any kind
of profit with this software without proper prior written consent
from the author. You may not remove this copyright notice, license
or terms of use.

PS: Restrictive license as I did not yet have time to publish
these backup data quality verification, deduplication and data
synchronization tools. I would be willing to do so if someone
assists in getting the software distributed as a Debian package."""
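# Example invocation, as used in the reproducer from the mail above
# (the JSON index is written to stdout):
#   ./SimpleIndexer x > x.json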

import hashlib
import json
import os
import re
import stat
import sys


def fileNameToResourceName(fileName):
# Percent-encode all bytes outside the unreserved character set.
  nameBytes = fileName.encode(sys.getdefaultencoding(), errors='surrogateescape')
  nameBytesSanitized = [
      chr(val) if val in b'0123456789-.ABCDEFGHIJKLMNOPQRSTUVWXYZ_abcdefghijklmnopqrstuvwxyz~'
      else '%%%02x' % (val & 0xff) for val in nameBytes]
  return ''.join(nameBytesSanitized)
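# Example: fileNameToResourceName('a b.txt') returns 'a%20b.txt'.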

def listDir(pathName, parentResourceUrl, excludeRegexList):
  """List the directory, apply exclude rules, sort all results
  and return the list suitable for indexing stack use."""
  resultList = []
  for fileName in os.listdir(pathName):
    filePathName = os.path.join(pathName, fileName)
    resourceName = fileNameToResourceName(fileName)
    fileResourceName = parentResourceUrl+resourceName
# Add '/' to directories but not links pointing to directories.
    if stat.S_ISDIR(os.lstat(filePathName).st_mode):
      fileResourceName += '/'

    if excludeRegexList:
      for regex in excludeRegexList:
        if regex.match(fileResourceName):
          fileResourceName = None
          break
    if fileResourceName is not None:
      resultList.append((filePathName, fileResourceName))
  resultList.sort(key=lambda x: x[1])
  return resultList

def createDigest(pathName):
  """Create a hexadecimal digest string for the file pointed
  out by pathname."""
  digestObj = hashlib.md5()
  digestFile = open(pathName, 'rb')
  while True:
    digestData = digestFile.read(1<<24)
    if not digestData:
      break
    digestObj.update(digestData)
  digestFile.close()
  return digestObj.hexdigest()


def doIndex(rootPathName, excludeRegexStrList=None, indexStartPath=None,
    updateIndexFile=None):
  """Create an index for a given file system part.
  @param rootPathName byte string pointing to the file system
  root for indexing.
  @param excludeRegexStrList a list of regular expression strings
  to match against the ResourceUrl."""

  excludeRegexList = None
  if excludeRegexStrList:
    excludeRegexList = [re.compile(regexStr) for regexStr in excludeRegexStrList]

  oldIndexData = []
  oldIndexDataPos = 0
  if updateIndexFile:
    indexFile = open(updateIndexFile, 'rb')
    oldIndexData = indexFile.read()
    indexFile.close()
    oldIndexData = json.loads(str(oldIndexData, 'ascii'))

# Have an indexing stack with a sorted list of elements to visit,
# the index of the next element and the ResourceUrl up to the
# level currently being processed.
  indexStack = []
  indexStack.append([[(rootPathName, '/')], 0, '/'])
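# Each stack frame is a list [sortedEntryList, nextEntryIndex,
# parentResourceUrl], matching the description above.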

# Begin the JSON output format list.
  continuationFlag = False
  print('[')
  while indexStack:
    stackFrame = indexStack[-1]
    fileSystemPath, resourceUrl = stackFrame[0][stackFrame[1]]
# Zero out the element to avoid keeping a large number of items
# referenced when they are not needed any more.
    stackFrame[0][stackFrame[1]] = None
    stackFrame[1] += 1
    if stackFrame[1] == len(stackFrame[0]):
      del indexStack[-1]

    oldRecord = None
    for oldIndexDataPos in range(oldIndexDataPos, len(oldIndexData)):
      if oldIndexData[oldIndexDataPos]['url'] < resourceUrl:
        continue
      if oldIndexData[oldIndexDataPos]['url'] == resourceUrl:
        oldRecord = oldIndexData[oldIndexDataPos]
      break

    indexResult = {}
    statData = os.lstat(fileSystemPath)
    indexResult['group'] = statData.st_gid
    indexResult['inode'] = statData.st_ino
    indexResult['length'] = statData.st_size
    indexResult['mode'] = statData.st_mode & 0o7777
    indexResult['mtime'] = statData.st_mtime
    indexResult['type'] = None
    indexResult['url'] = resourceUrl
    indexResult['user'] = statData.st_uid

    if stat.S_ISDIR(statData.st_mode):
      indexResult['type'] = 'dir'
      subResourceList = listDir(
          fileSystemPath, resourceUrl, excludeRegexList)
      if subResourceList:
        indexStack.append([subResourceList, 0, resourceUrl])
    elif stat.S_ISFIFO(statData.st_mode):
      indexResult['type'] = 'pipe'
    elif stat.S_ISLNK(statData.st_mode):
      indexResult['type'] = 'link'
      indexResult['typedata'] = os.readlink(fileSystemPath)
    elif stat.S_ISREG(statData.st_mode):
      indexResult['type'] = 'file'
# Only this step should be skipped if old and new entry are identical.
      if oldRecord is not None:
        indexResult['digest-md5'] = oldRecord['digest-md5']
      if oldRecord != indexResult:
        indexResult['digest-md5'] = createDigest(fileSystemPath)
    elif stat.S_ISSOCK(statData.st_mode):
      indexResult['type'] = 'socket'
    else:
      raise Exception('Unhandled file type for %s' % fileSystemPath)
    recordData = json.dumps(indexResult, sort_keys=True)
    if continuationFlag:
      sys.stdout.write(',\n%s' % recordData)
    else:
      sys.stdout.write('%s' % recordData)
      continuationFlag = True
  print(']')


def mainFunction():
  """This is the main function to analyze the program call arguments
  and invoke indexing."""

  excludeRegexStrList = []
  indexStartPath = None
  updateIndexFile = None

  argPos = 1
  while argPos < len(sys.argv)-1:
    argName = sys.argv[argPos]
    argPos += 1
    if argName == '--':
      break
    if argName == '--Exclude':
      excludeRegexStrList.append(sys.argv[argPos])
      argPos += 1
      continue
    if argName == '--Include':
      indexStartPath = sys.argv[argPos]
      argPos += 1
      continue
    if argName == '--Update':
      updateIndexFile = sys.argv[argPos]
      argPos += 1
      continue
    break

  if argPos+1 != len(sys.argv):
    print('No indexing root path given (last argument)', file=sys.stderr)
    sys.exit(1)
  rootPathName = sys.argv[argPos]
  doIndex(
      rootPathName, excludeRegexStrList=excludeRegexStrList,
      indexStartPath=indexStartPath, updateIndexFile=updateIndexFile)


if __name__ == '__main__':
  mainFunction()

[-- Attachment #3: IndexDeduplicationAnalyzer --]
[-- Type: text/plain, Size: 10900 bytes --]

#!/usr/bin/python3 -BbbEIsSttW all

"""This module implements a simple index file analyzer to find
resources eligible for deduplication based on the file digest.
Therefore, block-based deduplication candidates cannot be found
by this tool.

As duperemove does not use the FS_IOC_FIEMAP system call for
some reason, duperemove runs are VERY slow. Therefore perform
these checks here.


This software is provided by the copyright owner "as is" and
any expressed or implied warranties, including, but not limited
to, the implied warranties of merchantability and fitness for
a particular purpose, are disclaimed. In no event shall the
copyright owner be liable for any direct, indirect, incidental,
special, exemplary or consequential damages, including, but not
limited to, procurement of substitute goods or services, loss
of use, data or profits or business interruption, however caused
and on any theory of liability, whether in contract, strict liability,
or tort, including negligence or otherwise, arising in any way
out of the use of this software, even if advised of the possibility
of such damage.

Copyright (c) 2019 halfdog <me (%) halfdog.net>

You may not distribute copies, re-license, sell or make any kind
of profit with this software without proper prior written consent
from the author. You may not remove this copyright notice, license
or terms of use.

PS: Restrictive license as I did not yet have time to publish
these backup data quality verification, deduplication and data
synchronization tools. I would be willing to do so if someone
assists in getting the software distributed as a Debian package."""

import ctypes
import fcntl
import json
import os
import subprocess
import sys


class FiemapExtentStruct(ctypes.Structure):
# Mirrors struct fiemap_extent from the kernel uapi headers.
  _fields_ = [
      ('logicalOffset', ctypes.c_int64),
      ('physicalOffset', ctypes.c_int64),
      ('length', ctypes.c_int64),
      ('reserved64', ctypes.c_int64*2),
      ('flags', ctypes.c_int32),
      ('reserved32', ctypes.c_int32*3)]

class FiemapStruct(ctypes.Structure):
# Mirrors the struct fiemap request header from the kernel uapi headers.
  _fields_ = [
      ('startOffset', ctypes.c_int64),
      ('mapLength', ctypes.c_int64),
      ('flags', ctypes.c_int32),
      ('extentCopied', ctypes.c_int32),
      ('extentsAvailable', ctypes.c_int32),
      ('reserved', ctypes.c_int32)]


class FiemapExtentHelper():
  """This class wraps the helper for checking a list of files
  if their extends are already deduplicated."""

  def __init__(self):
    self.fiemap = FiemapStruct()
    self.maxExtentsCount = 4096
    self.fiemapExtent = FiemapExtentStruct()
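# The ioctl buffer holds the fiemap request header followed by
# space for up to maxExtentsCount fiemap_extent records.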
    self.buffer = (ctypes.c_int8*(ctypes.sizeof(self.fiemap)+ctypes.sizeof(FiemapExtentStruct)*self.maxExtentsCount))()

  def checkFiemapExtentsMatch(self, dataCheckLength, fileNameList):
    """Check if all extents alread match.
    @return True when all extents are the same."""
    extentDataDict = {}
    for fileName in fileNameList:
      self.fiemap.startOffset = 0
      self.fiemap.mapLength = dataCheckLength
# FIEMAP_FLAG_SYNC 1
      self.fiemap.flags = 1
      self.fiemap.extentCopied = 0
      self.fiemap.extentsAvailable = self.maxExtentsCount
      ctypes.memmove(
          ctypes.addressof(self.buffer), ctypes.addressof(self.fiemap),
          ctypes.sizeof(self.fiemap))

      testFd = os.open(fileName, os.O_RDONLY|os.O_NOFOLLOW|os.O_NOCTTY)
# 0xc020660b is the FS_IOC_FIEMAP ioctl request number.
      result = fcntl.ioctl(testFd, 0xc020660b, self.buffer, True)
      os.close(testFd)
      if result != 0:
        raise Exception()
      ctypes.memmove(
          ctypes.addressof(self.fiemap), self.buffer,
          ctypes.sizeof(FiemapStruct))
      if (self.fiemap.extentCopied >= self.maxExtentsCount) or \
          (self.fiemap.extentsAvailable > self.maxExtentsCount):
        raise Exception(
            'Extents list exhausted: copied %d, available %d' % (
                self.fiemap.extentCopied, self.fiemap.extentsAvailable))
      extentList = []
      for extentPos in range(0, self.fiemap.extentCopied):
        ctypes.memmove(
            ctypes.addressof(self.fiemapExtent),
            ctypes.addressof(self.buffer)+ctypes.sizeof(FiemapStruct)+extentPos*ctypes.sizeof(FiemapExtentStruct),
            ctypes.sizeof(FiemapExtentStruct))
        extentList.append((
            self.fiemapExtent.logicalOffset,
            self.fiemapExtent.physicalOffset,
            self.fiemapExtent.length))
      extentDataDict[fileName] = extentList
    print('Got dict: %s' % repr(extentDataDict), file=sys.stderr)
    refExtentList = None
    for fileName, extentList in extentDataDict.items():
      if refExtentList is None:
        refExtentList = extentList
      else:
        if extentList != refExtentList:
          return False
    return True
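# Example with the paths from the reproducer in the mail above:
#   FiemapExtentHelper().checkFiemapExtentsMatch(
#       155648, [b'/mnt/test/x/file-0', b'/mnt/test/x/file-1'])
# returns True only if both files report identical extent lists.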


class IndexedLocation():
  """This class stores information about each indexed location."""

  def __init__(self, dataPathName, indexFileName):
    self.dataPathName = dataPathName
    self.indexFileName = indexFileName

  def getDataPath(self):
    return self.dataPathName

  def getIndexData(self):
    indexFile = open(self.indexFileName, 'rb')
    indexData = indexFile.read()
    indexFile.close()
    return json.loads(str(indexData, 'ascii'))


class DedupAnalyzer():
  def __init__(self):
# 0: normal, 1: info, 2: debug
    self.logLevel = 0
    self.fiemapExtentHelper = FiemapExtentHelper()
    self.dedupFileList = []

  def addFile(self, fileNameBytes):
    """Add the given filename to the output performing the appropriate
    resource URL transformations."""
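# Example: the indexed name b'file%20name' is decoded back to
# b'file name' before being added to the list.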
    if fileNameBytes.find(b'\n') >= 0:
      raise Exception('Output format does not support newlines')
    checkPos = 0
    while True:
      checkPos = fileNameBytes.find(b'%', checkPos)
      if checkPos < 0:
        break
      replaceByte = int.to_bytes(
          int(fileNameBytes[checkPos+1:checkPos+3], 16), 1, 'big')
      fileNameBytes = fileNameBytes[:checkPos]+replaceByte+fileNameBytes[checkPos+3:]
      checkPos += 1
    self.dedupFileList.append(fileNameBytes)


  def flushDedupList(self, dataCheckLength):
    if len(self.dedupFileList) > 1:
      if self.fiemapExtentHelper.checkFiemapExtentsMatch(
          dataCheckLength, self.dedupFileList):
        if self.logLevel >= 1:
          print(
              'Files %s already deduplicated' % repr(self.dedupFileList),
              file=sys.stderr)
      else:
        sys.stdout.buffer.write(b'\n'.join(self.dedupFileList)+b'\n\n')
    self.dedupFileList = []


  def createDeduplicationData(self, indexLocationList):
    """Create the deduplication data to be feed into duperemove.
    The function extracts all hash/length/path results from all
    indices, passes them to sort for memory efficient sorting and
    uses the output to create the duperemove output format."""
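# Each line passed to sort has the format
# b'<digest-md5> <length> <data path><resource url>\n'.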

    sortProcess = subprocess.Popen(
        ['/usr/bin/sort'], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    for indexLocation in indexLocationList:
      dataPath = bytes(indexLocation.getDataPath(), 'ascii')
      indexData = indexLocation.getIndexData()
      for indexRecord in indexData:
        if indexRecord['type'] in ['dir', 'link', 'pipe', 'socket']:
          continue
        if 'digest-md5' not in indexRecord:
          raise Exception('Strange %s' % repr(indexRecord))
        sortProcess.stdin.write(b'%s %d %s%s\n' % (
            bytes(indexRecord['digest-md5'], 'ascii'), indexRecord['length'],
            dataPath, bytes(indexRecord['url'], 'ascii')))
    sortProcess.stdin.close()

# Now read all entries.
    lastDigest = None
    lastLength = None
    dataBuffer = b''
    while True:
      lineEnd = dataBuffer.find(b'\n')
      if lineEnd < 0:
        inputData = sortProcess.stdout.read(1<<20)
        if not inputData:
          if len(dataBuffer):
            raise Exception('Unhandled input data %s' % repr(dataBuffer))
          self.flushDedupList(lastLength)
          sys.stdout.buffer.write(b'\n')
          break
        dataBuffer += inputData
        continue
      lineData = dataBuffer[:lineEnd].split(b' ')
      dataBuffer = dataBuffer[lineEnd+1:]

      if len(lineData) != 3:
        raise Exception('Malformed line %s' % repr(lineData))

      if lineData[0] != lastDigest:
        self.flushDedupList(lastLength)

        lastDigest = lineData[0]
        lastLength = int(lineData[1])
        self.addFile(lineData[2])
        continue

      if int(lineData[1]) != lastLength:
        raise Exception('Collision')

# Ignore empty files.
      if lastLength == 0:
        continue

# This is a duplicate.
      self.addFile(lineData[2])

    sortProcess.stdout.close()
    processResult = sortProcess.wait()
    if processResult != 0:
      print('Sort terminated with error %d' % processResult, file=sys.stderr)
      sys.exit(1)
    print('Dedupe search complete', file=sys.stderr)


def mainFunction():
  """This is the main function to analyze the program call arguments
  and invoke indexing."""

  indexLocationList = []

  argPos = 1
  while argPos < len(sys.argv)-1:
    argName = sys.argv[argPos]
    argPos += 1
    if argName == '--':
      break
    if argName == '--IndexedDir':
      dataPathName = os.path.realpath(sys.argv[argPos])
      argPos += 1
      if dataPathName == '/':
        print(
            'Cannot use %s as indexed dir as no parent directory exists' % dataPathName, file=sys.stderr)
        sys.exit(1)
      indexFileName = os.path.normpath('%s-Index.json' % dataPathName)
      if (not os.path.isdir(dataPathName)) or \
          (not os.path.exists(indexFileName)):
        print(
            'Data path %s or index file %s does not exist' % (
                repr(dataPathName), repr(indexFileName)), file=sys.stderr)
        sys.exit(1)
      indexLocationList.append(IndexedLocation(dataPathName, indexFileName))
      continue
    if argName == '--IndexFile':
      if argPos+2 > len(sys.argv):
        print(
            '--IndexFile requires index file and data path argument',
            file=sys.stderr)
        sys.exit(1)
      indexFileName = os.path.realpath(sys.argv[argPos])
      argPos += 1
      dataPathName = os.path.realpath(sys.argv[argPos])
      argPos += 1
      if (not os.path.isdir(dataPathName)) or \
          (not os.path.exists(indexFileName)):
        print(
            'Data path %s or index file %s does not exist' % (
                repr(dataPathName), repr(indexFileName)), file=sys.stderr)
        sys.exit(1)
      indexLocationList.append(IndexedLocation(dataPathName, indexFileName))
      continue

    print('Unsupported argument %s' % argName, file=sys.stderr)
    sys.exit(1)

  analyzer = DedupAnalyzer()
  analyzer.createDeduplicationData(indexLocationList)

if __name__ == '__main__':
  mainFunction()

Thread overview: 5+ messages
2019-12-12 16:15 halfdog [this message]
2019-12-13  9:32 ` Qu Wenruo
2019-12-14  5:19   ` Zygo Blaxell
2019-12-14 11:21     ` Filipe Manana
2020-01-30 16:59 ` David Sterba
