Talk:Extent (file systems)

Latest comment: 8 years ago by Chininazu12 in topic NOTOC?

Unnamed section

On the "Comparison of File Systems" page, Apple's HFS+ filesystem is listed as supporting extents. However, it's not listed on this page; perhaps it should be? I'm not familiar enough with the system to know whether it's true or not, so I won't add it myself, but maybe somebody else can confirm or deny and rectify the discrepancy. It seems that one page or the other needs to be changed. -- Kadin2048 15:38, 6 January 2006 (UTC)

How is allocating an extent to a file different from pre-allocating a contiguous series of blocks, if the file system supports packing more than one file into an extent? And how does extent allocation differ from just using a larger block size? Martin Rudat(T|@|C) 08:56, 18 February 2006 (UTC)

At least as I see it, a file system that "supports extents" has "file maps" (the "file map" being the per-file data structure used to map from offsets in the file to blocks on disk) that can contain entries that can map a variable-length range of file offsets to a contiguous sequence of disk blocks, while a file system that doesn't "support extents" has file maps that have entries that map a fixed-size range of offsets to a fixed-size contiguous sequence of disk blocks.
With that model, a larger block size for a file system that doesn't "support extents" just increases the size of the fixed-size ranges of offsets; a file system that "supports extents" can allocate extents with different sizes to the same file. You don't need extents to support pre-allocating a contiguous series of blocks, just to allow a single file map entry to point to the entire contiguous series.
Are there any file systems that support packing more than one file into an extent? That'd probably be useful only for files (or file tails) smaller than the minimum size of an extent; otherwise, you might as well split the extent into multiple smaller extents and assign each of those smaller extents in its entirety to the file whose data it contains. Guy Harris 23:33, 28 October 2006 (UTC)
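The file-map model described above can be sketched roughly as follows. This is an illustration only, not any real file system's on-disk format: the `Extent` record, the 4096-byte `BLOCK_SIZE`, and both lookup functions are hypothetical names chosen for the example.

```python
from bisect import bisect_right
from dataclasses import dataclass

BLOCK_SIZE = 4096  # assumed block size in bytes (hypothetical)


@dataclass(frozen=True)
class Extent:
    file_block: int  # first logical block of the file covered by this entry
    disk_block: int  # first disk block of the contiguous run
    length: int      # number of blocks in the run (varies per entry)


def lookup_block_map(block_map, offset):
    """No extents: every entry maps one fixed-size block of the file."""
    return block_map[offset // BLOCK_SIZE]


def lookup_extent_map(extents, offset):
    """Extents: every entry maps a variable-length contiguous run.

    `extents` must be sorted by file_block.
    """
    file_block = offset // BLOCK_SIZE
    starts = [e.file_block for e in extents]
    i = bisect_right(starts, file_block) - 1
    if i < 0:
        raise ValueError("offset not mapped")
    e = extents[i]
    if file_block >= e.file_block + e.length:
        raise ValueError("offset not mapped")
    return e.disk_block + (file_block - e.file_block)


# The same 1,000-block file, stored contiguously from disk block 5000.
block_map = [5000 + i for i in range(1000)]  # 1,000 map entries
extents = [Extent(0, 5000, 1000)]            # 1 map entry
```

Both maps resolve the same offset to the same disk block; the difference is only in how many entries are needed to describe the contiguous series.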
In a FS without extents, if you allocate x contiguous blocks for a file and the file needs to grow beyond those x blocks, there may be a case where you simply can't let it grow in place, because it would overwrite other used blocks.
By using extents, you can allocate another chunk of contiguous space, located somewhere else, and thus solve this problem.
This concept can be found in Silberschatz's "Operating System Concepts", and I think it's not explained clearly (or at all) in the article. Should I edit it? --Asymmetric (talk) 14:15, 13 April 2008 (UTC)
How is this any different from when a "block-based" FS fragments a file? When using FAT, if you have a file that is 2350 clusters in size and you're going to append 400 clusters to the end of it, but there is another file in the way that prevents contiguous allocation, doesn't the OS typically search the FAT for a 400-cluster chunk of free space, allocate it, and continue? This article doesn't differentiate an "extent" from "a series of contiguous disk blocks." What's the difference? I see the same thing posed in discussion below, and I think that this article seriously requires details on what an "extent" is and HOW IT IS DIFFERENTIATED FROM OTHER FILE ALLOCATION TECHNIQUES (i.e. blocks or logs). I may do it if I ever find the time. Daivox (talk) 00:58, 5 September 2008 (UTC)
The point is that the file system has no way of knowing whether you are going to append 4k or 4G to a file until you actually write the data, so the filesystem has to allocate an extent. When the file is closed, it should deallocate the space that wasn't used. 85.95.41.77 (talk) 20:21, 20 August 2009 (UTC)
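To make the FAT example above concrete, here is a small sketch under simplifying assumptions: the cluster chain is modelled as a plain list of cluster numbers rather than a real FAT, and `chain_to_extents` is a hypothetical helper. The file is fragmented on disk in the same way either way; what differs is how many entries the bookkeeping needs.

```python
def chain_to_extents(chain):
    """Collapse a list of cluster numbers into (start, length) runs."""
    extents = []
    for cluster in chain:
        if extents and cluster == extents[-1][0] + extents[-1][1]:
            # Cluster directly follows the current run: extend it.
            start, length = extents[-1]
            extents[-1] = (start, length + 1)
        else:
            # Gap on disk: start a new run.
            extents.append((cluster, 1))
    return extents


# 2350 contiguous clusters, then 400 more placed elsewhere because
# another file is in the way (cluster numbers are made up).
chain = list(range(100, 2450)) + list(range(9000, 9400))

print(len(chain))               # 2750 per-cluster links
print(chain_to_extents(chain))  # [(100, 2350), (9000, 400)]
```

In this simplified picture a per-cluster scheme records 2750 links for the file, while an extent-style map records two entries for the same two fragments.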

extent

Shouldn't the definition of "extent" include "the entire scope of the boundaries (of something)"? Shortylumber 17:28, 5 September 2006 (UTC)

If "extent" is merely a pre-reserved contiguous region of blocks given to a file, then Apple DOS 3.3 had "extents", as did Microsoft's FAT12. Both use bitmaps to record which regions of the disk are used, and both allocate an entire track of space to a file opened for writing, regardless of how much space is requested, then recover the unused blocks when the file is closed. Both lose space on disk rapidly when applications/filehandles are not properly closed. Luckily FAT12 shipped with CHKDSK to report exactly how much of the disk space was lost, so that users could tell when to reformat their disk (or purchase 3rd party tools to fix it). -- w_barath 6:08pm, 02 August 2008 (PDT) -- Preceding undated comment was added at 01:09, 3 August 2008 (UTC)

extents - definition of "extent" is too vague in this article?

It's worth mentioning that the definition of "extent" used in the article appears to be too vague and too broad. Overall, it looks as if two classes of file systems have formed historically.

  • "Classic" block-based filesystem designs usually mark space as occupied by adjusting block allocation bitmaps, so if 100 000 blocks become busy, 100 000 bits have to be toggled. Something like this is done by ext2/3 or NTFS (in its $Bitmap). While this approach does not inherently prevent huge contiguous runs of blocks from being allocated sequentially, it makes allocating large contiguous extents quite inefficient, because the states of a large number of bits in the bitmaps have to be changed, which is obviously slow for large "extents". So I would rather describe such a design as "groups of blocks located together" than as a true "extent-based design".
  • The second class of filesystems are those that have abandoned the idea of block allocation via bitmaps and instead mark a whole run of contiguous blocks as busy via one compact structure that defines the whole range of blocks as occupied. These structures and the free space can then be indexed in B-trees to improve allocation speed, etc. The most obvious examples would be ext4, XFS, Btrfs, etc. The major difference is the amount of metadata that has to change to describe the same allocation (toggling 100 000 bits vs. writing a "blocks 1 to 100 000 are used now" mark). When it comes to large files and volumes, bitmaps get too big and updating them slows down the whole filesystem. Modern "true" extent-based designs like ext4 do not suffer from this problem, since they use a compact structure to describe a large extent at once, hence very little CPU processing power is needed to handle a large chunk of data.
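The metadata-cost comparison in the two bullets above can be illustrated with a toy sketch. It models only the commenter's argument, not any real on-disk layout: `allocate_bitmap` and `allocate_extent` are hypothetical helpers, and a real bitmap implementation may set bits a whole word at a time rather than one by one.

```python
def allocate_bitmap(bitmap, start, count):
    """Mark blocks busy one bit at a time; return the number of bits set."""
    for block in range(start, start + count):
        bitmap[block // 8] |= 1 << (block % 8)
    return count


def allocate_extent(extent_list, start, count):
    """Record the whole range in one entry; return the records written."""
    extent_list.append((start, count))
    return 1


bitmap = bytearray(200_000 // 8)  # toy volume of 200 000 blocks
extent_list = []

bits = allocate_bitmap(bitmap, 1, 100_000)          # blocks 1 to 100 000
records = allocate_extent(extent_list, 1, 100_000)  # same range, one record
```

Here `bits` counts one update per block, while `records` stays at one no matter how long the range is.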

What's the problem with the definition used? Honestly, I do not see what would prevent each and every filesystem from being called "extent-based" as long as it implements the obvious optimization of grouping nearby blocks together and attempting to write sequential blocks into a sequential gap in free space. So, for example, under the definition used, ext2 is "extent-based" as long as its driver attempts to write files sequentially. However, in the sense developers commonly use, ext2 is not extent-based, just a plain old block-based design.

I would propose refining the definition along the lines of "an extent is a large contiguous set of blocks which is defined via a compact structure describing the range of block numbers being used, rather than by individually marking each block as allocated in allocation bitmaps". As for me, it's marking each block in allocation bitmaps vs. writing a short structure describing

HPFS

HPFS can preallocate space before a write. Is that the same?

VTOCs on MVS, z/OS, etc

The Volume Table of Contents (VTOC) on an MVS volume describes the space assigned to data sets (files, in more common terminology), free space, and even its own size in terms of extents.

Rlhamil 16:27, 26 March 2007 (UTC)

Ext4 is missing here! -- Preceding unsigned comment added by 80.223.145.74 (talk) 15:25, 17 December 2007 (UTC)

NTFS

Is it really valid to list NTFS here? The described method requires the developer to perform file writes in a specific manner to make this happen. In my opinion this doesn't make NTFS qualify as an extent-based file system. I believe that in order for a FS to qualify it must perform this task automatically; if we're going to allow this as part of the definition, then nearly any file system should be able to make it onto this list. For example, one could make nearly any file system pre-allocate space by using comparable calls. It should also be noted that the specified Win32 API calls do NOT guarantee that this space will be contiguous! The system will try to make the space contiguous, but it might not be possible, so I really believe that by these standards literally any file system could be put on this list, and therefore NTFS should be removed. -- Preceding unsigned comment added by 24.222.116.189 (talk) 11:51, 2 January 2008 (UTC)

I just wanted to add that it looks like NTFS was removed from the list (probably rightly so), but the file system matrix on the following wiki page still shows the contrary: http://en.wiki.x.io/wiki/Comparison_of_file_systems --May 8, 2008 -- Preceding unsigned comment added by 99.201.204.33 (talk) 18:00, 8 May 2008 (UTC)
NTFS addresses file data in extents, rather than in individual blocks like ext3, but it doesn't automatically preallocate. Frankly, I'm not sure how it or any other file system could reasonably be expected to do this, because it doesn't know what the required size will be, and preallocating too much space would tend to _increase_ fragmentation. 166.70.56.217 (talk) 01:45, 30 November 2008 (UTC)
Actually, NTFS seems to use $Bitmap to mark blocks as allocated. So if a file of 100 000 blocks is being allocated, it looks like NTFS would need to toggle 100 000 bits in $Bitmap to record the allocation, so a change is needed for every block, which is a fairly slow operation. If this design is "extent-based", then ext2 is definitely extent-based as well, since nothing prevents the filesystem driver from trying to allocate several contiguous blocks in one shot and writing blocks in a grouped manner. More recent designs that are considered "extent-based" usually do not employ allocation bitmaps and would rather write an extent structure describing the allocation as a range, "blocks 1 to 100 000 are now used by..." (which is faster than toggling 100 000 bits), and then readjust the free/busy space indexes (in some B-trees, etc.).

I've added NTFS. It does use extents, but Microsoft refers to the extents as runs. I found this out in the book Inside Windows 2000, written by Mark Russinovich and David Solomon. --Thunderpenguin (talk) 14:12, 3 December 2008 (UTC)

They seem to have something different in mind when it comes to the definition of extents. Basically, I would refuse to call any bitmap-based filesystem design extent-based, since a bit has to be toggled for each block, making it an old-style block-based design that does not benefit from addressing data via large extents defined in a compact structure. As for me, such a definition of extent is flawed.

It IS however unnecessary to add SQL Server, as all modern versions of Windows use NTFS anyway. You might as well say Windows 7, Vista and XP! -- Preceding unsigned comment added by 192.55.54.38 (talk) 19:18, 12 November 2010 (UTC)

what about LVM and DB/2?

"Physical Extents" and "Logical Extents" --Jerome Potts (talk) 00:44, 15 March 2013 (UTC)

NOTOC?

Is there a solid reason why there was a __NOTOC__ on this article? The article is not that short. If not, there's really no reason to add it. Chininazu12 (talk) 12:17, 26 July 2016 (UTC)