Under the Stress of Reform: High-Performance Computing in the Former Soviet Union

International Perspectives


P. Wolcott and S.E. Goodman

COMMUNICATIONS OF THE ACM October 1993/Vol.36, No.10

[...]ment and have introduced new problems which have brought HPC development and manufacture nearly to a halt.

Since the beginning of the modern electronic digital computing era in the mid-1940s, a number of governments adopted variations of the Soviet-style, centralized-directive economy for at least the "commanding heights" of their national industries. Many of these countries invested heavily in scientific development and the application of technology, and technological progress was a key element in their political/economic ideologies. The HPC sector of the FSU provides an example of one of the most established and deeply embedded high-technology sectors in any of these economies. It sought to provide, with [...]ed success, computing power for the USSR's space program, the military-industrial sector, the world's most extensive program for ex[...]ing natural resources, and the world's largest scientific community, among others. The problems of transformation experienced by this sector are, to greater or lesser degrees, shared by other advanced technology sectors in the FSU and elsewhere. All contended with the constraints of the Soviet-style, centralized-directive system and now bear the burden of their past as they struggle to become viable in their current, often traumatic, and still very unsettled new economic contexts. This article considers the problematic history, response to reform, and future prospects of HPC in the FSU.

Early Successes, But a Growing "Computer Gap"

Under the direction of S.A. Lebedev, Soviet engineers began developing electronic digital computers in the late 1940s under extremely trying conditions in war-torn Kiev. Since 1950, the most powerful Soviet computers have been developed under Lebedev and his successors at the Institute of Precision Mechanics and Computer Technology (ITMVT) in Moscow. By the mid-1960s, ITMVT engineers had developed the BESM-6, a one-MIPS system which, when introduced, was close to the worldwide state of the art in HPC. Through the early 1980s, over 350 BESM-6s were manufactured, and most remained in use until around 1990. A workhorse of Soviet scientific computing, this machine probably logged more hours of use per unit manufactured than any other serially produced computer in the world.

Research and development of a number of high-performance systems covering a wide range of architectural approaches were initiated during the 1970s and 1980s at ITMVT and other institutes within the industrial ministries, the Academy of Sciences, and the Ministry of Higher Education. The number of prototypes completed increased dramatically during the 1980s, although only those designed within industrial ministries reached series production. A subset of recent systems is shown in Table 1. A common thread through most projects was the goal of achieving high performance and reliability through parallelism.

In spite of being a priority sector, having many of the most talented engineers in Soviet computing, and enjoying close ties to the military industries, the HPC sector was not capable of overcoming basic systemic and technological difficulties to provide the country with high levels of computational power. The aggregate peak performance of series production machines manufactured in 1991, when Soviet HPC production was at its height, was over two orders of magnitude less than that of Cray Research, Inc. alone. Two systems that have been prototyped, the MKP (Modular Pipeline Processor) and the Elektronika SSBIS ("Red Cray"), could increase this number considerably, but are not likely to enter series production. By 1992, orders for the two main systems in series production, the El'brus-2 and the PS-2100, declined to almost zero. Furthermore, most late-model high-end systems were placed in restricted installations or used for dedicated applications. Much of the civilian scientific community had to rely on aging BESM-6s or mid-sized functional duplicates of IBM mainframes.

HPC Development in the Soviet Context

As a result of both Western and Soviet national security concerns and controls, the USSR established a complete, autonomous indigenous computer industry. The components, materials, and technologies used to build computers in the high-performance sector were supplied by domestic industries. The systems were very complex, and the upstream infrastructure consisted of hundreds of organizations.

In the absence of Western-like market factors which compelled organizations to upgrade their technology or lose customers, high-profile end products were often used to drive Soviet technological advances. The El'brus series is the most prominent example in HPC. These computers provided an application which demanded improved technology. El'brus engineers also pioneered R&D for supporting technologies later used, for example, in high-end [...]

Although necessary within the Soviet context, this strategy forced the development of prototype machines that incorporated a much higher level of immature technologies than is practiced in the West. Designers had to deal not only with problems in architecture and construction faced in any new system project, but also with compounded problems introduced by design flaws or low reliability in constituent technologies. Development time for machines playing this role was often 10 years or more.

In order to keep pace with rapid developments in the West, at least on paper, Soviet engineers designed new machines that had target performances on a par with advanced Western machines. Given the long development cycles, this meant that a machine would have to have a peak performance one to two orders of magnitude greater than its predecessor. Such long strides placed additional stress on supporting industries, aggravating problems with the technological base.

Organizing R&D and production of new technologies at supporting facilities was a technological and administrative nightmare. Arranging for the development of new technologies required negotiations at each step along a lengthy bureaucratic path through ministry, Communist Party, and government planning agency hierarchies. Even when all necessary approvals had been obtained, individual production facilities exercised nontrivial control over production schedules. Factories, especially those with better reputations or which manufactured critical items, had more orders than they could fill. The factory director had to choose to fill some orders before others. Retooling for the production of new technologies cost time and effort, and required the termination of some existing production, possibly causing state-dictated production targets to be missed, with serious financial consequences for both management and workers. Because each factory also had to depend on many other suppliers, assimilating new production was stressful and risky. Factory directors tended to favor the manufacture of simpler products already in production. Overcoming this disinclination to assimilate new technologies required working the higher levels of bureaucracy to bring appropriate pressure on factories.
The monopolistic structure of the Soviet economy further complicated matters. A high percentage of inputs to HPC were manufactured at one, or a small number, of facilities. If delays occurred anywhere, there were few alternatives to waiting, applying political pressure, or seeking to develop the necessary technology in-house. Each of these approaches cost time and effort, and increased uncertainty. Finding alternative suppliers was usually not an option.

Table 1: Late Soviet HPC Systems. Sources: [1,4]. (* System reached series production.)

Projects within the Academy of Sciences typically faced additional obstacles. The administrative distance between Academy R&D facilities and the factories in industrial ministries on which they depended for prototype construction was great. The factories tended to resist working on the Academy's "exotic" architectures. Designed first and foremost to demonstrate novel machine design principles, the academic systems often did not deliver the "real-world" performance claimed, were too difficult to program, or otherwise did not suit "real-world" users with demanding applications.

The Soviet centralized-directive economic system was established on the belief that it would lead to greater coordination and efficiency of the economy, and rapid advance in priority areas through focused allocation of resources. The HPC sector demonstrates that in practice it often produced none of these benefits, even within the military sphere. The system failed in part because resources were not infinite, and there was too much to prioritize. HPC competed for resources with other advanced technologies, both civilian and military. Stern admonishments from ministers and members of the Military-Industrial Commission could hasten development of one project for a time, but often at the cost of delays in others. Delays in any link of the development chain affected the time of completion of the end product.

The PS-2000 was one of the few projects which combined a practical, industrial orientation with modest demands on supporting industry, close relations between R&D and production facilities, and high-level support during the R&D phase. It was one of the most successful Soviet parallel systems. The system passed acceptance testing only four years after development was initiated, and about 200 machines were manufactured during the 1980s. Its successor, the PS-2100, required the development of new components and had a longer development period.

HPC Under Reform and Transition

A key goal of the perestroika reforms was revitalizing the Soviet Union's economy by increasing efficiency through automation and technological advance rather than by increasing inputs such as labor and capital. Gorbachev sought to accelerate technological advance through both centralized and decentralized measures. He initiated the creation of large-scale interindustry organizations focused on specific technologies (e.g., robotics, personal computers), encouraged organizations involved with different stages of product life cycles to join together into so-called production and scientific-production associations, and increased the importance of long-term comprehensive plans for focusing resources on specific technologies, including high-performance computing. At the same time, he supported legislation that would give individual enterprises, institutes, and subenterprise organizational units greater control over and responsibility for their own structure, activities, relations with foreign and domestic organizations, and financial dealings.

The reforms unleashed forces which caused a fragmentation of many traditional pillars of Soviet society: the centralized planning mechanism, the political control of the Communist Party, the directed economic relationships between organizations, and the administrative hierarchies within the industrial ministries. In particular, the computing industry is now roughly split into two sectors, one essentially the remnant of the old state sector based on government ministries and the Academy, and the other a growing "mixed" sector of private, state, and foreign activities [2]. These trends have accelerated since the break-up of the former USSR in late 1991.

So far, these changes have brought both opportunities and severe problems to HPC. The centralized-directive form of economic management has to a significant degree been replaced by one based on economic considerations. Enterprises and institutes have the ability to interact with other organizations directly, rather than through administrative hierarchies. Feedback between customers and providers has increased as customers become more demanding and providers develop excess production and development capacity. The reforms have brought new flexibility to the organization of R&D, and temporary, task-oriented teams are now used widely in place of rigid laboratory structures. Opportunities for contact with the international community have increased dramatically, opening the door for better professional interaction, foreign investment, and joint projects.

However, these benefits so far have been overshadowed by an economy undergoing traumatic changes, the evaporation of demand for virtually all domestically manufactured computers, sharp decreases in state funding, and fundamental weaknesses in the supporting infrastructure. Given severe reductions in the military budget, state orders for HPC declined to nearly zero by the end of 1992.
Many organizations have no funds to spend on computing, and those that do are likely to purchase or otherwise obtain foreign-made machines such as personal computers or workstations. (For example, Sun Microsystems reports that it has shipped over 1,000 SPARC-based workstations to the Commonwealth of Independent States.) Funding for R&D has continued for some projects nearing completion, especially those designed by industry, but at levels that are enough to keep teams intact but not high enough to make them very productive. Many projects, such as those carried out under the START program featured in the June 1991 issue of Communications [3], have been terminated for lack of funding. Funding for new large-scale projects is rarely available.

Under these circumstances, HPC R&D facilities have sought ways to retain personnel and preserve capability. To varying degrees, they have transformed their organizational structures from a unified hierarchy of divisions and laboratories into a collection of loosely coupled organizational units, typically formed on the basis of existing development teams or divisions, each with considerable autonomy and control over financial affairs. Such units have been able to take advantage of more favorable tax and wage regulations and have decentralized responsibility for finding funding, easing the burden on the institute's leadership. By remaining under the umbrella of the old institute, the new units can have the advantages of the former's name recognition and physical facilities. One prominent example is the Moscow SPARC Center. Formed around a core of El'brus engineers, this small enterprise at ITMVT is currently working on microprocessor design and software development projects for Sun Microsystems, Inc. It exists autonomously, renting facilities from ITMVT.

The organizational changes present a dilemma for R&D facilities. On one hand, a structure based on loosely coupled, autonomous units is more viable as plentiful, large-scale government funding gives way to relatively scarce, smaller-scale contracts.
On the other hand, the proliferation of small autonomous units compromises the ability of an institute as a whole to carry out the R&D needed for large-scale, integrated projects. While the contracts with Sun Microsystems have been beneficial in providing much-needed funding, challenging projects to maintain engineers' skills, and international exposure and contacts, they will not result in the creation of a new Russian supercomputer. Some skills may be transferable to other projects, but Sun is not funding development of a complete system, nor one that could be constructed using manufacturing technology available in Russia today.

Other structural problems remain that fundamentally compromise Russia's ability to carry out large-scale HPC development. Under the Soviet system, R&D facilities were tied to production facilities by the centralized planning system. In scientific production associations, they were also united administratively on a more local level. In no case, however, were the proceeds of sales from production factories channeled directly back into R&D. Funding for R&D was provided specifically for that purpose by ministries or principal sponsors. In post-USSR Russia, administrative ties between R&D and production facilities have been broken. Although factories are now able to retain proceeds from sales, R&D facilities must find alternative sources of income. Without strong links to production facilities, it will be difficult to move research results into production and recover the investment.

Relationships with upstream industries have become more market-oriented, but the infrastructure remains poorly suited to providing all of the technology needed for HPC development. The microelectronics industry continues to be unable to manufacture large quantities of components with high levels of integration, and few domestic alternatives exist if a supplier experiences delays or is unable to manufacture the necessary technology. Furthermore, in the absence of a market for domestic HPC, many microelectronics factories prefer to concentrate on technologically modest chips for higher-volume consumer electronics.

Prospects

The increasingly fragmented HPC sector is losing the ability to design and manufacture high-performance systems. The Soviet system placed many obstacles in developers' paths, but sustained the sector through its funding and planning mechanisms and a captive customer base. This protective system is being stripped away, exposing the HPC sector to the opportunities and harsh realities of a new economic system which is more competitive, dynamic, international, and uncertain. Long-term prospects are not necessarily bleak; Russia especially has many competent engineers and a great potential need for HPC cycles for an enormous variety of applications. However, the road to viability will be long and difficult.

The Russians' short-term strategy is to keep core development teams intact until basic economic conditions improve, through some combination of subsistence government funding, credits, foreign investment, small-scale contract work, and commercial resale activities. Longer-term survival for both industrial and academic players will depend on their ability to develop systems which are competitive within some scientific or commercial niche.

Industry will need to establish the integrity of the research-development-manufacturing cycle if product sales are to be used to fund R&D. However, the government is losing its ability to order the fusion of R&D and production facilities. As decomposition and privatization progress, providing R&D facilities with large-scale production facilities will have to be accomplished by acquisition, construction, or strategic alliance. The first two require funds currently unavailable; the latter is unlikely in light of the poor market for HPC products.

Reducing dependence on indigenous industries is a key precondition to competitiveness.
Although in the future it is possible that advanced technologies may be manufactured in volume in Russia, as long as the domestic upstream industries are not able to provide the necessary technologies in a timely manner, the HPC sector must find alternatives. Furthermore, by using only domestic technology, HPC developers will not be able to ride the waves of international technological advance.

Technological advance and changing geopolitical relationships have increased the availability of mass-produced Western technologies. It has become difficult for CoCom export controls to prevent or significantly slow the flow of products like powerful microprocessors or scientific workstations that are made in large numbers. It is becoming increasingly possible to build parallel processors using predominantly commercial technologies.

Incorporating Western technology is no panacea. Most advanced systems require at least some customized components. Acquiring Western technology is still more difficult and expensive in Moscow than in Silicon Valley. Adopting Western technology will require fundamental shifts in design philosophies and strategies. It requires giving up the belief that HPC should drive a broad spectrum of domestic industries. It is likely to reduce the diversity of architectural approaches. In the near term at least, it also will mean giving up attempts to build expensive, complex machines like those from Cray or NEC. More feasible projects are those incorporating much higher percentages of off-the-shelf components, where most of the challenges are in software and algorithms, which are less dependent on upstream industries.

Follow-ups and Pointers

This article benefited enormously from multiple visits over the last half dozen years to the organizations and people who developed the machines listed in Table 1. We are grateful to the many people who spent so much time with us under often difficult circumstances.

Many scientific and technological fields in the FSU are suffering from the problems this column discusses. Taken together with the greater freedom for scientists and engineers to have foreign contacts, the science and technology communities of Russia and other Newly Independent States have been further damaged by severe "brain drains" as talented people look for better working conditions abroad. For example, two of the three authors of articles on the START Project featured in the June 1991 issue of Communications [3] have found long-term work in the U.S. and Sweden. However, some of these "drains" have helped to create new opportunities for the R&D teams back in the FSU (e.g., one of the former START leaders has helped people at his Russian institute obtain research contracts from the U.S.).

Additional information on past columns: there have been some notable additions to the use of the information technologies in the Middle East since the August 1992 coverage of that region in "International Perspectives." In particular, there is now a Bitnet connection with Iran, and the restrictions on use of faxes in Syria have been relaxed. We have had such an extensive response to the February 1993 "International Perspectives" column on computing in sub-Saharan Africa that we intend to revisit that part of the world at least once before the end of next year. Information from readers on this part of the world is welcomed.

Readers are encouraged to send comments, suggestions, anecdotes, insightful speculation, raw data, and submissions for guest columns on any subject relating to international aspects of the information technologies. All correspondence should be addressed to: Sy Goodman, MIS/BPA, University of Arizona, Tucson, AZ 85721; goodman@bpa.arizona.edu; or fax: 602-621-2433.

References

1. Dorozhevets, M.N. and Wolcott, P. The El'brus-3 and MARS-M: Recent advances in Russian high-performance computing. J. Supercomput. 6 (1992), 5-48.

2. Goodman, S.E. and McHenry, W.K. The Soviet computer industry: A tale of two sectors. Comm. ACM 34, 6 (June 1991), 25-29.

3. Kotov, V. Project START. Comm. ACM 34, 6 (June 1991), 30-31. The same issue of Communications also features three technical articles by Kotov, E. Tyugu, and A.S. Narin'yani.

4. Wolcott, P. Soviet advanced technology: The case of high-performance computing. Ph.D. dissertation, The University of Arizona, June 1993.
