Google云计算解决方案6_分布式文件系统汇编
合集下载
相关主题
- 1、下载文档前请自行甄别文档内容的完整性,平台不提供额外的编辑、内容补充、找答案等附加服务。
- 2、"仅部分预览"的文档,不可在线预览部分如存在完整性等问题,可反馈申请退款(可完整预览的文档不适用该条件!)。
- 3、如文档侵犯您的权益,请联系客服反馈,我们会尽快为您处理(人工客服工作时间:9:00-18:30)。
Google Cluster Computing Faculty Training Workshop
Module VI: Distributed Filesystems
Outline
• Filesystems overview • NFS & AFS (Andrew File System) • GFS
Drive-wide:
Available space Formatting info character set ...
Per-file:
name owner modification date physical layout...
High-Level Organization
• Files are organized in a “tree” structure made of nested directories • One directory acts as the “root” • “links” (symlinks, shortcuts, etc) provide simple means of providing multiple access paths to one file • Other file systems can be “mounted” and dropped in as sub-hierarchies (other drives, network shares)
– inodes store all block ids related to file
© Spinnaker Labs, Inc.
Fragmentation
A B C (free space)
A
B
C
A
(free space)
A
(free space)
C
A
(free space)
A
D
C
A
DБайду номын сангаас
(free)
Design Considerations
• Smaller inode size reduces amount of wasted space • Larger inode size increases speed of sequential reads (may not help random access) • Should the file system be faster or more reliable? • But faster at what: Large files? Small files? Lots of reading? Frequent writers, occasional readers?
© Spinnaker Labs, Inc.
Low-Level Organization (1/2)
• File data and meta-data stored separately • File descriptors + meta-data stored in inodes
– Large tree or table at designated location on disk – Tells how to look up file contents
© Spinnaker Labs, Inc.
What Gets Stored
• User data itself is the bulk of the file system's contents • Also includes meta-data on a drive-wide and per-file basis:
• Meta-data may be replicated to increase system reliability
© Spinnaker Labs, Inc.
Low-Level Organization (2/2)
• “Standard” read-write medium is a hard drive (other media: CDROM, tape, ...) • Viewed as a sequential array of blocks • Must address ~1 KB chunk at a time • Tree structure is “flattened” into blocks • Overlapping reads/writes/deletes can cause fragmentation: files are often not stored with a linear layout
– Absolute: /home/aaron/foo.txt – Relative: docs/someFile.doc
• The shortest absolute path to a file is called its canonical path • The set of all canonical paths establishes the namespace for the filesystem
– Addressable by a filename (“foo.txt”) – Usually supports hierarchical nesting (directories)
© Spinnaker Labs, Inc.
File Paths
• A file path joins file & directory names into a relative or absolute address to identify a file
© Spinnaker Labs, Inc.
File Systems Overview
• System that permanently stores data • Usually layered on top of a lower-level physical storage medium • Divided into logical units called “files”
Module VI: Distributed Filesystems
Outline
• Filesystems overview • NFS & AFS (Andrew File System) • GFS
Drive-wide:
Available space Formatting info character set ...
Per-file:
name owner modification date physical layout...
High-Level Organization
• Files are organized in a “tree” structure made of nested directories • One directory acts as the “root” • “links” (symlinks, shortcuts, etc) provide simple means of providing multiple access paths to one file • Other file systems can be “mounted” and dropped in as sub-hierarchies (other drives, network shares)
– inodes store all block ids related to file
© Spinnaker Labs, Inc.
Fragmentation
A B C (free space)
A
B
C
A
(free space)
A
(free space)
C
A
(free space)
A
D
C
A
DБайду номын сангаас
(free)
Design Considerations
• Smaller inode size reduces amount of wasted space • Larger inode size increases speed of sequential reads (may not help random access) • Should the file system be faster or more reliable? • But faster at what: Large files? Small files? Lots of reading? Frequent writers, occasional readers?
© Spinnaker Labs, Inc.
Low-Level Organization (1/2)
• File data and meta-data stored separately • File descriptors + meta-data stored in inodes
– Large tree or table at designated location on disk – Tells how to look up file contents
© Spinnaker Labs, Inc.
What Gets Stored
• User data itself is the bulk of the file system's contents • Also includes meta-data on a drive-wide and per-file basis:
• Meta-data may be replicated to increase system reliability
© Spinnaker Labs, Inc.
Low-Level Organization (2/2)
• “Standard” read-write medium is a hard drive (other media: CDROM, tape, ...) • Viewed as a sequential array of blocks • Must address ~1 KB chunk at a time • Tree structure is “flattened” into blocks • Overlapping reads/writes/deletes can cause fragmentation: files are often not stored with a linear layout
– Absolute: /home/aaron/foo.txt – Relative: docs/someFile.doc
• The shortest absolute path to a file is called its canonical path • The set of all canonical paths establishes the namespace for the filesystem
– Addressable by a filename (“foo.txt”) – Usually supports hierarchical nesting (directories)
© Spinnaker Labs, Inc.
File Paths
• A file path joins file & directory names into a relative or absolute address to identify a file
© Spinnaker Labs, Inc.
File Systems Overview
• System that permanently stores data • Usually layered on top of a lower-level physical storage medium • Divided into logical units called “files”