Storage of very large volumes of information | Publishing house Radiotekhnika


Journal Science Intensive Technologies, No. 4, 2010

Article in issue:

Storage of very large volumes of information

Keywords: petabytes of data; data repositories; distributed file system; fault-tolerant system

Authors:

V.A. Kozmidiady

Abstract:

The article examines principles for building centralized repositories that can hold some tens of petabytes of information. The requirements for such a repository are:
- the maximum volume of stored information can be about 10 PB, and capacity must scale easily from hundreds of TB to 10 PB;
- the repository must be accessible both over a network and locally;
- access time must be no worse than in existing storage systems; sublogarithmic growth of this time with volume may be acceptable;
- high fault tolerance must support operation in 24/7/365 mode (24 hours a day, 7 days a week, 365 days a year), and faults and failures must not result in data loss;
- the cost of storing data in the repository must be of the same order as the cost of storage on an ordinary hard disk.

The difficulties of storing data of this volume in distributed file systems are examined. To reach the stated volumes, the article proposes abandoning compliance with the POSIX standard in a number of cases. The main deviations from POSIX are:
- abandoning the directory tree of the file system in favor of a flat model, i.e. giving up the file system hierarchy, together with support for hard and symbolic links;
- abandoning file names in favor of a data set identifier, which is generated when the data set is created and returned to the user;
- allowing different users to see different content at the same moment in time;
- restricting the set of operations available on the repository. A tentative set: creating a data set with specified content, reading a data set, deleting a data set, and modifying a data set by replacing its content.

In addition, the article proposes ensuring reliable, uninterrupted storage by maintaining several copies of every data set.

The article proposes operating principles for a distributed information repository and the operations for accessing it. Particular attention is paid to fault tolerance, i.e. to the behavior of the repository when nodes fail and when they recover.
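
The flat, identifier-based model described in the abstract can be illustrated with a minimal in-memory sketch. It is not the article's design: the class name `FlatStore`, the placement rule, and the parameters `node_count` and `copies` are all hypothetical, chosen only to show the four operations (create, read, delete, replace) and reads that survive the loss of a node holding one copy.

```python
import uuid


class FlatStore:
    """Toy in-memory model of a flat, identifier-addressed store (illustrative only)."""

    def __init__(self, node_count=5, copies=3):
        if copies > node_count:
            raise ValueError("copies must not exceed node_count")
        # Each "node" is a dict mapping a data set identifier to its content.
        self.nodes = [dict() for _ in range(node_count)]
        self.alive = [True] * node_count
        self.copies = copies

    def _placement(self, ds_id):
        # Hypothetical placement rule: consecutive nodes starting at id mod node count.
        start = uuid.UUID(ds_id).int % len(self.nodes)
        return [(start + i) % len(self.nodes) for i in range(self.copies)]

    def create(self, content):
        # No file name and no directory: the store generates the identifier
        # and returns it to the caller.
        ds_id = uuid.uuid4().hex
        for n in self._placement(ds_id):
            self.nodes[n][ds_id] = content
        return ds_id

    def read(self, ds_id):
        # Any live node that holds a copy can serve the read.
        for n in self._placement(ds_id):
            if self.alive[n] and ds_id in self.nodes[n]:
                return self.nodes[n][ds_id]
        raise KeyError(ds_id)

    def replace(self, ds_id, content):
        # Modification means replacing the whole content, not editing in place.
        self.read(ds_id)  # raises KeyError if the data set is unknown
        # Simplification: writes ignore node liveness; recovery is not modeled.
        for n in self._placement(ds_id):
            self.nodes[n][ds_id] = content

    def delete(self, ds_id):
        for n in self._placement(ds_id):
            self.nodes[n].pop(ds_id, None)


store = FlatStore(node_count=5, copies=3)
ds = store.create(b"first version")
store.alive[store._placement(ds)[0]] = False  # simulate the failure of one node
print(store.read(ds))  # still readable from another copy
```

The sketch leaves out what the article focuses on: how copies are brought back in sync after a node recovers, and how nodes agree on the current content.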

Pages: 4-18
