Zenfs gc sekhar feature #4

soumendus · 2021-07-06T21:58:36Z

No description provided.

skyzh · 2021-07-07T01:26:30Z

@soumendus I have updated your code and formatted it. Please run git pull to update your local branch.

skyzh · 2021-07-07T01:32:48Z

fs/io_zenfs.cc

+  int dont_read = 0;
+
+  // Sort the Extent list in decreasing order.
+  std::sort(extent_list.begin(), extent_list.end(),


Is it okay to change the extent order of a file? File content is stored sequentially, in the order of its extents. Say we have three extents: 1 (1 KB), 2 (1 KB), and 3 (2 KB). A read at byte offset 2048 should begin at extent 3. After we sort the extents, the file content changes. Correction: I mistook this extent list for the file's extent list. So I have another question: how do we construct the contents of ZenFSGCWorker? Do we have a plan for this?
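To make the ordering concern concrete, here is a minimal sketch of how a byte offset maps to an extent when a file's content is the concatenation of its extents in list order. The `Extent` struct and `FindExtentIndex` are hypothetical names for illustration, not ZenFS code, and the concern applies only to a per-file extent list, not to the GC worker's list that the patch sorts.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified extent: only its length matters here.
struct Extent {
  uint64_t length;
};

// Returns the index of the extent that contains byte `offset` of the file,
// or extents.size() if the offset lies past the end of the file.
// The logical file content is the concatenation of the extents in list
// order, so the answer depends on that order.
std::size_t FindExtentIndex(const std::vector<Extent>& extents,
                            uint64_t offset) {
  uint64_t file_pos = 0;  // logical offset at which extents[i] begins
  for (std::size_t i = 0; i < extents.size(); ++i) {
    if (offset < file_pos + extents[i].length) {
      return i;
    }
    file_pos += extents[i].length;
  }
  return extents.size();
}
```

With extents of 1024, 1024, and 2048 bytes, offset 2048 resolves to index 2 (the third extent). If the same list is sorted by decreasing length, the same offset resolves to index 1, which is a different extent, so reads would return different bytes.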

skyzh · 2021-07-07T04:25:47Z

fs/io_zenfs.cc

+
+    // Store the new starting position for the extent
+    // which will be later made persistent.
+    new_start = zone_dst->wp_;


If there could be multiple threads appending data to a zone, how could we ensure that new_start is really at zone_dst->wp_?

Yes, that's a valid concern. We need a way to serialize that. I did not find a way to lock the zone and operate on it, though there may be one. This needs a bit more thought.

Yes. As far as I can see, in ZenFS a zone has only one writable file at a time. AllocateZone marks a zone as being written after allocating it, so other files cannot use that zone for writes. Maybe allocating free zones in dst_zone_list could solve this issue.

Hmm. OK, we have to add that logic in the function GetDestZoneToMoveValidData() and the function should push zones to the dst_zone_list member variable.

To get the list of destination zones in GetDestZoneToMoveValidData(), we may need to call AllocateZone() with open_for_write_ set to false, or write another version of AllocateZone(), e.g. AllocateZoneForGC(), that sets the flag to false. Then, in MoveValidDataToNewDestZone(), after we fetch the destination zone's current parameters such as wp_ (the write pointer), we can set open_for_write_ back to true and start appending to that zone.

As in https://github.com/bzbd/zenfs/blob/master/fs/zbd_zenfs.cc#L518, allocating a zone without setting open_for_write to true may lead to issues. The zone might be seen as "open but not allocated" and be allocated to other files. Therefore, I don't think this would work.
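One way to picture the serialization being discussed is to make "read the write pointer" and "advance the write pointer" a single step under a lock, so that no two writers can be handed the same new_start. The sketch below is only an illustration under that assumption: `DestZone` and `Reserve` are hypothetical names, not the ZenFS Zone class, and it does not model the device-level requirement that writes to a zone arrive in sequential order.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for a destination zone; not the ZenFS Zone class.
class DestZone {
 public:
  DestZone(uint64_t start, uint64_t capacity)
      : wp_(start), end_(start + capacity) {}

  // Claims `length` bytes in one locked step. On success, stores the offset
  // where the caller's data will begin in *new_start and returns true.
  // Returns false, leaving the zone unchanged, if the space does not fit.
  bool Reserve(uint64_t length, uint64_t* new_start) {
    assert(new_start != nullptr);
    std::lock_guard<std::mutex> guard(mu_);
    if (length > end_ - wp_) {
      return false;
    }
    *new_start = wp_;
    wp_ += length;
    return true;
  }

 private:
  std::mutex mu_;
  uint64_t wp_;         // next free byte in the zone
  const uint64_t end_;  // first byte past the zone's capacity
};
```

The alternative raised in the thread, giving the GC worker zones that no other writer can use, avoids the lock entirely because there is then only one writer per destination zone.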

skyzh · 2021-07-07T04:32:12Z

fs/io_zenfs.cc

+    ZoneFile* file_moved;
+    file_moved = *zone_file_it;
+
+    // What if the file is deleted before coming here?


I believe we should lock the metadata and scan if there is any new records before updating metadata. This may require more careful design work.

IOStatus s; is never assigned or modified throughout this function...

SyncFileMetadata() calls PersistRecord(), which holds a lock while updating the metadata, so I did not use any lock. Maybe you are right. I was thinking we could remove UpdateMetadataAfterMerge() completely and apply the metadata changes in MoveValidDataToNewDestZone() itself. That would mean less code and one fewer function. But to do that we need references to the files whose extents have been moved. From a file it is easy to get references to its extents, but I am not sure how to get from an extent back to its file.

Looks good. If possible, please update the design doc for new information.

skyzh · 2021-07-07T04:33:01Z

fs/io_zenfs.h

+  std::vector<ZoneFile*> files_moved_to_dst_zone;
+
+  std::atomic<uint64_t>
+      total_residue_;  // Is atomic necessary since only one thread at one time?


Seems that this variable is not used in this patch?

skyzh · 2021-07-13T02:23:36Z

fs/io_zenfs.cc

+      // will have a new starting position. No need to
+      // change the length of the extent as it will be the
+      // same.
+      ext->start_ = new_start;


Same lock issue here. While we are changing extent information, some other operation may be in the middle of reading the extent's location. We need to apply some kind of lock throughout the GC process.
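A minimal sketch of the kind of protection being asked for, assuming a per-extent mutex: readers copy the extent's location under the lock, and GC publishes the new start under the same lock. `GuardedExtent`, `Snapshot`, and `Relocate` are hypothetical names for illustration, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>

// Plain copy of where an extent lives.
struct ExtentLocation {
  uint64_t start;
  uint64_t length;
};

// Hypothetical wrapper that guards an extent's location with a mutex.
class GuardedExtent {
 public:
  GuardedExtent(uint64_t start, uint64_t length) : loc_{start, length} {}

  // Readers copy the location under the lock, so they never observe a
  // partially updated value.
  ExtentLocation Snapshot() const {
    std::lock_guard<std::mutex> guard(mu_);
    return loc_;
  }

  // GC calls this after the data has been copied to its new position.
  // The length is unchanged, matching the comment in the patch.
  void Relocate(uint64_t new_start) {
    std::lock_guard<std::mutex> guard(mu_);
    loc_.start = new_start;
  }

 private:
  mutable std::mutex mu_;
  ExtentLocation loc_;
};
```

This only protects the metadata itself. A reader that took a snapshot before the relocation may still read from the old position, so the source zone would need to stay intact until such readers finish; that part is not covered here.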

skyzh · 2021-07-13T02:24:07Z

fs/io_zenfs.cc

+
+      // The current zone cannot fit this extent because of lack
+      // of space, so get the next zone from the dst_zone_list.
+      zone_it++;


Should we consider the case that there is not enough destination zone?

Yes, that possibility could arise. GetDestZoneToMoveValidData() should do the math to figure out how many zones are needed to fit all the extents (cold valid data) and call AllocateZone() for them. If it cannot allocate enough zones to fit all the cold valid data, it returns an IOStatus error, "Not enough zones". In that case, the GC cannot progress. A slight design change could help: move only as much valid data as we have space (zones) for. Since this is a periodic background reclaim, the remaining zones could be reclaimed in later cycles. Just a thought, but this could make the implementation more intricate.
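The partial-move idea can be sketched as a planning step that counts how many extents fit into the destination zones currently available, leaving the rest for a later cycle. This is a sketch under stated assumptions: `CountExtentsThatFit` is a hypothetical helper, extents are never split across zones, and zones are filled in order without going back to an earlier one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns how many extents, taken in list order, can be placed into the
// given destination zones. Each extent must fit whole inside one zone.
// `zone_free_bytes` is taken by value because the function consumes space
// from its own copy while planning.
std::size_t CountExtentsThatFit(const std::vector<uint64_t>& extent_lengths,
                                std::vector<uint64_t> zone_free_bytes) {
  std::size_t zone = 0;
  std::size_t moved = 0;
  for (uint64_t len : extent_lengths) {
    // Advance past zones that cannot hold this extent whole. Zones are
    // never revisited once the plan has moved on from them.
    while (zone < zone_free_bytes.size() && zone_free_bytes[zone] < len) {
      ++zone;
    }
    if (zone == zone_free_bytes.size()) {
      break;  // out of destination zones; the rest waits for the next cycle
    }
    zone_free_bytes[zone] -= len;
    ++moved;
  }
  return moved;
}
```

A caller could treat a result of zero as the "Not enough zones" case and otherwise move only the first `moved` extents in this GC cycle.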

skyzh · 2021-07-13T02:29:03Z

fs/io_zenfs.cc

+    ZoneFile* file_moved;
+    file_moved = *zone_file_it;
+
+    // What if the file is deleted before coming here?


Looks good. If possible, please update the design doc for new information.

skyzh

Generally looks good.

skyzh · 2021-07-13T14:47:11Z

I've seen an issue in RocksDB facebook/rocksdb#8504 about NoSpace condition. It seems that it's better not to use == to check the IOStatus. Instead, we should check the status code inside. (Not sure if this is necessary in ZenFS)

soumendus · 2021-07-13T16:30:06Z

I see that 's == IOStatus::NoSpace()' is used in several other places in ZenFS, so it might be OK.

For instance, the following code in fs_zenfs.cc uses 's == IOStatus::NoSpace()':

    IOStatus ZenFS::PersistSnapshot(ZenMetaLog* meta_writer) {
      .
      .
      s = WriteSnapshotLocked(meta_writer);
      if (s == IOStatus::NoSpace()) {
        Info(logger_, "Current meta zone full, rolling to next meta zone");
        s = RollMetaZoneLocked();
      }
      .
      .
    }
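To illustrate why comparing statuses with == can differ from checking the code inside, here is a toy status type. It is not rocksdb::IOStatus. It assumes that equality compares only the top-level code and ignores the subcode, which is one reading of the concern raised above and is not confirmed in this thread; the real behavior should be checked against the RocksDB headers.

```cpp
#include <cassert>

// Toy status type for illustration only; NOT rocksdb::IOStatus.
enum class Code { kOk, kIOError };
enum class SubCode { kNone, kNoSpace, kPathNotFound };

class ToyStatus {
 public:
  static ToyStatus OK() { return ToyStatus(Code::kOk, SubCode::kNone); }
  static ToyStatus NoSpace() {
    return ToyStatus(Code::kIOError, SubCode::kNoSpace);
  }
  static ToyStatus PathNotFound() {
    return ToyStatus(Code::kIOError, SubCode::kPathNotFound);
  }

  // Assumption made for this sketch: equality looks only at the top-level
  // code, so any two I/O errors compare equal.
  bool operator==(const ToyStatus& rhs) const { return code_ == rhs.code_; }

  // Explicit check that inspects both the code and the subcode.
  bool IsNoSpace() const {
    return code_ == Code::kIOError && subcode_ == SubCode::kNoSpace;
  }

 private:
  ToyStatus(Code code, SubCode subcode) : code_(code), subcode_(subcode) {}
  Code code_;
  SubCode subcode_;
};
```

Under that assumption, a "path not found" error would compare equal to NoSpace() and take the "meta zone full" branch, while an explicit check such as IsNoSpace() would not.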

soumendus added 24 commits June 27, 2021 11:47

Update io_zenfs.h

96954ab

ext_to_zone_map

Update io_zenfs.h

6d6e5d7

Update io_zenfs.cc

9aa4ccd

UpdateMetadataAfterMerge() function

Update io_zenfs.cc

90cbf8b

Update fs_zenfs.h

b5674a2

friend class ZenFSGCWorker;

Update io_zenfs.h

e7435f3

Indentation of file metadata function

313ca13

Indentation

Removed a typo in UpdateMetadataAfterMerge()

5890cac

There was a typo

Fixed Misplaced unlock

f3df7ed

Fixed Misplaced unlock

No need for ext_to_zone_map, removing it

c9a45ec

No need for ext_to_zone_map, removing it

Changed return type from void to IOStatus

f6210df

Changed return type from void to IOStatus

Added MoveValidDataToNewDestZone() function

7f67872

Added MoveValidDataToNewDestZone() function

Modified UpdateMetadataAfterMerge() function to support IOStatus retu…

afb8924

…rn type and removed ext_to_zone_map related code. Modified UpdateMetadataAfterMerge() function to support IOStatus return type and removed ext_to_zone_map related code.

Changed void from IOStatus

15e752e

Changed void from IOStatus

Missing variable IOStatus

583567c

Missing variable IOStatus

TODO comment

69c733b

comment

Added ReadExtent() function declaration

73795f0

Added ReadExtent() function declaration

Added definition of ReadExtent()

3685f5b

Added definition of ReadExtent()

Modified MoveValidDataToNewDestZone()

eac8b8b

Modified MoveValidDataToNewDestZone()

Type cast to char*

e9bc3ff

Type cast to char*

TODO comment

aaa1695

TODO comment

Removed comment

efe7c4d

Modified MoveValidDataToNewDestZone() and added helper ReadExtent()

c9b2e52

Modified MoveValidDataToNewDestZone() and added helper ReadExtent()

fixed Slice param

f53fdbb

fixed Slice param

soumendus requested review from levichen94, royguo, shampoo365, Yuanliang-Wang and ZYYByteDance and removed request for royguo July 6, 2021 22:01

soumendus removed request for levichen94, shampoo365, Yuanliang-Wang and ZYYByteDance July 6, 2021 22:02

royguo and others added 5 commits July 7, 2021 06:42

Comment

a8ea09f

comment

e529a1e

ci: specify pointer alignment for clang-format 12

a4eea36

Merge branch 'master' into zenfs_gc_sekhar_feature

010b413

format code and merge master

f023556

skyzh reviewed Jul 7, 2021

soumendus added 2 commits July 7, 2021 00:01

Fix for memory leak comment by Alex Chi

8476c8d

Fixed toggling of the flag dont_read

53b40b6

skyzh reviewed Jul 7, 2021

skyzh reviewed Jul 7, 2021

soumendus added 9 commits July 7, 2021 22:07

Removed ptr null check before delete

83cae4a

Fix: dont_read flag set to zero

459696c

Use the IOStatus variable

6dc51ef

Moved status check inside dont_read branch

e22f9a3

Aligned address for zone append

530b7a0

Removed code under #if 0

d703522

Added comment

2a10511

Alignment

2f85c63

clang-format aligned

26eccac

skyzh reviewed Jul 13, 2021

skyzh force-pushed the master branch from a4eea36 to 5c75717 Compare July 19, 2021 02:51

weilewei mentioned this pull request Sep 15, 2021

WIP: bugFix background Worker and add unit tests #37

Closed
