Preparation for Recitation on GFS

Read the GFS paper here. You've seen GFS before: it is the system that MapReduce relied on to replicate files.

GFS is a system that replicates files across machines. It's meant for an environment where lots of users are writing to the files, the files are really big, and failures are common. Sections 2-4 of the paper describe the design of GFS, Section 5 discusses how GFS handles failures, and Sections 6-7 detail the authors' evaluation and real-world usage of GFS.

To check whether you understand the design of GFS, you should be able to answer the following questions: What is the role of the master? How does a read work? How does a write work?

As you read, think about:

Questions for Recitation

Before you come to this recitation, write up (on paper) a brief answer to each of the following questions (really, we don't need more than a couple of sentences per question). If your TA has asked you to email your answers instead, you may do that, but they must still be submitted before your recitation begins.

Your answers to these questions should be in your own words, not direct quotations from the paper.

As always, there are multiple correct answers for each of these questions.