Summary
Cloud Computing refers to the sum of Software as a Service (SaaS), in which providers offer application-level services to end users over the Internet, and Utility Computing, the hardware and software systems in data centers that Cloud Providers sell to SaaS providers so that they can deliver their services. Three things are new in Cloud Computing and separate it from previous large-scale computation facilities: the illusion of infinite computing resources available on demand, the elimination of an up-front commitment by Cloud users, and the ability to pay for computing resources on a short-term basis as needed. Similar to the rise of TSMC, Cloud Computing can provide a more economical solution to companies by leveraging statistical multiplexing of computing resources and by eliminating the redundancy of every company keeping a large IT staff to care for expensive and complicated servers. By creating centralized places for offering computing resources, cloud providers can uncover factors of 5 to 7 decreases in the cost of electricity, network bandwidth, operations, etc. Currently, the level of abstraction presented to the programmer and the level of management of resources (more specifically, the computation, storage, and communication models) are what distinguish different Cloud providers: among those mentioned in the paper, Amazon EC2 sits at the lowest level, closest to the hardware, Microsoft Azure in the middle, and Google AppEngine at the highest level. The paper lays out the incentives for companies to become cloud providers and the cost savings that moving to the cloud will allow businesses to realize in their IT budgets. It also highlights three areas where more innovation is needed (application software, infrastructure software, and hardware systems) and a new range of applications made possible by Cloud Computing, including mobile interactive applications, parallel batch processing, large-scale data analytics, extensions of compute-intensive desktop applications, etc. Furthermore, another keyword for Cloud Computing besides cost savings is elasticity, which refers to the "pay as you go" model and the benefits it brings to users coping with demand variation; there is little or no cost penalty for using 20 times more resources for 1/20th of the time, allowing many tasks to be expedited cheaply. Lastly, the paper walks through the top 10 obstacles to and opportunities for the growth of Cloud Computing and highlights future areas of advance as well as research directions.
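The cost symmetry behind elasticity can be shown with a small back-of-the-envelope sketch (the hourly price below is a made-up placeholder, not a quoted figure from the paper or any provider):

```python
# Toy illustration of cost associativity under pay-as-you-go pricing:
# using 20x the machines for 1/20th of the time costs the same total,
# but finishes far sooner. The hourly price is a hypothetical placeholder.
PRICE_PER_MACHINE_HOUR = 0.10  # dollars; assumed for illustration only

def job_cost(machines: int, hours: float) -> float:
    """Total bill for running `machines` instances for `hours` each."""
    return machines * hours * PRICE_PER_MACHINE_HOUR

slow = job_cost(machines=1, hours=20)   # 1 machine for 20 hours
fast = job_cost(machines=20, hours=1)   # 20 machines for 1 hour

assert slow == fast  # same bill, 20x less wall-clock time
```

Under this pricing model the bill depends only on machine-hours consumed, which is exactly why elastic scale-out carries little or no cost penalty.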
I really like the paper's "top 10 obstacles to and opportunities for growth" list: even though I have known the concept for a long time, it offered many unique and new insights into the challenges and shed light on future research directions that will help industry make Cloud Computing the de facto future of large-scale computing. The paper provides a really nice theoretical foundation for thinking about cloud computing and for identifying problems with it.
There are three things I do not quite like about the paper. Firstly, I think it would be much more convincing if more real usage data could be obtained and added. In many places the authors simply state facts that would benefit greatly from corroborating information from the Cloud Providers (I understand this can be very hard and is a general problem when studying industry technologies that move very fast, but more real usage data would make many claims much more solid). Hidden human-resource costs also seem to be ignored in many of the paper's calculations. Secondly, I am not sure about the use of elasticity to mitigate DDoS: elasticity can only help against application-layer DDoS or computation/IO-exhaustion attacks, such as floods of computation- and IO-heavy queries; against network-bandwidth attacks, however, spinning up more instances does not help once the incoming links themselves are saturated.
Thirdly, security and confidentiality/privacy are covered very briefly in this paper, which I think understates the importance of these two properties in the modern world. There are single points of failure arising from homogeneous hardware and software stacks, shared network links, data stores, and power grids, plus collateral damage from compromised machines in the cloud. More importantly, it is very hard to convince companies that their private and confidential data will be safe and will not be leaked. Even though the paper suggests encrypting data, key management can still be an issue, and complicated access-control policies can be hard to replicate on the service. It would be better if the paper spent a bit more space looking into these issues as well. The three main models of cloud computing mentioned in the paper (the models of computation, storage, and communication) could also be a useful starting point for examining the security and confidentiality implications.
Reuse of virtual machines might have interesting implications, including side-channel data leaks: for example, data belonging to a previous user may still be accessible to the next user via existing forensic techniques, since I believe most cloud providers do not securely wipe (e.g., with DBAN) the hard disk when an instance changes owners.
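As a toy model of this remanence concern (purely illustrative; real hypervisors and storage layers are far more complex), a "disk" whose blocks are marked free without being zeroed still yields the previous tenant's bytes to the next tenant:

```python
# Toy model: a shared "disk" of raw blocks. Freeing a block without
# zeroing it leaves the previous tenant's data readable by the next.
BLOCK_SIZE = 16
disk = bytearray(4 * BLOCK_SIZE)  # 4 raw blocks, reused over time

def write_block(idx: int, data: bytes) -> None:
    padded = data.ljust(BLOCK_SIZE, b"\x00")[:BLOCK_SIZE]
    disk[idx * BLOCK_SIZE:(idx + 1) * BLOCK_SIZE] = padded

def read_block(idx: int) -> bytes:
    return bytes(disk[idx * BLOCK_SIZE:(idx + 1) * BLOCK_SIZE])

def free_block(idx: int, zero: bool = False) -> None:
    # "Freeing" is metadata-only unless the block is explicitly zeroed.
    if zero:
        write_block(idx, b"\x00" * BLOCK_SIZE)

# Tenant A stores a secret, then the provider frees the block lazily.
write_block(0, b"tenant-A-secret")
free_block(0, zero=False)

# Tenant B, allocated the same block, can still recover A's bytes.
leaked = read_block(0)
assert leaked.startswith(b"tenant-A-secret")

# Zeroing on release is what prevents the leak.
free_block(0, zero=True)
assert read_block(0) == b"\x00" * BLOCK_SIZE
```

The same reasoning is why forensic tools can recover "deleted" files: deletion usually updates metadata, not the underlying blocks.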
I think dynamic scaling and optimization are very interesting. In Hadoop, for example, many applications chain multiple MapReduce jobs together into an end-to-end workflow, yet some stages in the pipeline may require far fewer resources than others, so allocating resources per stage would make the most of cloud computing's elasticity. In addition, it would be very interesting to have a tool that profiles a computation and finds both the resources needed to achieve the best cost-to-work ratio, which would be useful for cost-conscious users, and the resources needed to reach some threshold of marginal utility when adding more machines, which would be useful for time-conscious users. Beyond that, maybe an advanced architecture could be devised that lets users specify fine-grained IO and CPU usage, giving them more elastic options for getting the resources that best fit their workload. I am also curious whether there is already any work on using program analysis and compiler techniques to automatically parallelize a program written in a conventional language such as C, C++, or Python.
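The marginal-utility idea above could be sketched roughly as follows (a hypothetical tool of my own devising, not from the paper; it assumes the job's scaling follows Amdahl's law with a known serial fraction, which a real profiler would have to measure):

```python
# Hypothetical profiling sketch: find the smallest cluster size at which
# adding one more machine buys less than a chosen threshold of extra
# speedup (diminishing marginal utility). The Amdahl's-law model and the
# serial fraction are simplifying assumptions for illustration.

def speedup(machines: int, serial_fraction: float) -> float:
    """Amdahl's-law speedup for `machines` workers."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / machines)

def machines_for_marginal_utility(serial_fraction: float,
                                  threshold: float,
                                  max_machines: int = 1024) -> int:
    """Smallest n where speedup(n+1) - speedup(n) < threshold."""
    for n in range(1, max_machines):
        gain = speedup(n + 1, serial_fraction) - speedup(n, serial_fraction)
        if gain < threshold:
            return n
    return max_machines

# A job that is 5% serial: past this point, an extra machine adds less
# than 0.05x additional speedup, so a time-conscious user could stop here.
n = machines_for_marginal_utility(serial_fraction=0.05, threshold=0.05)
```

A cost-conscious variant would instead compare the extra speedup against the extra machine-hours billed and stop where the cost-to-work ratio is best.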
The paper discusses the economics behind the cloud-computing shift that has made it all possible and makes numerous calculations of the benefits it will bring to companies. I think it would be very interesting to ask economists what kind of market cloud providers will eventually form, given its unique properties such as extremely high fixed overhead, the leverage of statistical multiplexing, and reputation sharing. I am sure there is some economic model behind this kind of market, and maybe it would give readers some insight into what might happen in the end.