What is Serverless?
Serverless is short for serverless computing. You might wonder how computation is possible without servers. In reality, serverless doesn’t mean there are literally no servers. Instead, the platform abstracts servers away from the business logic, so developers can focus on their applications without managing the underlying infrastructure.
How Does Serverless Work?
Since the advent of Serverless, many developers have perceived it as a novel technology, which it undoubtedly is, given its convenience. However, there’s no need to overcomplicate or be intimidated by it. The underlying logic of running applications remains unchanged. Serverless merely employs technical means to shield us from the complexities involved, much like any other cloud technology.
Before Serverless, deploying a web application was a cumbersome process. To run our application, we first had to build a runtime environment on the server side. This entailed purchasing virtual machines, initializing their environments, and installing the required dependencies, keeping everything as consistent with our local development environment as possible. Next, to make our application accessible to users, we needed to purchase a domain name, point it at the virtual machine’s IP address, configure and start Nginx, and finally, upload and launch the application code.
In stark contrast to the traditional workflow, Serverless deployment requires only three simple steps: creating a function service, configuring an HTTP function trigger, and uploading the function code. This makes it an extreme abstraction of server-side operations. Essentially, the path a user’s HTTP request travels remains fundamentally unchanged; Serverless merely simplifies the overall model.
To elaborate, previously we had to build a runtime environment on the server side, whereas FaaS (Function as a Service) applications abstract this step into function services. We used to need load balancing and reverse proxies, but FaaS applications abstract these into HTTP function triggers. We used to upload code and start the application ourselves, but FaaS applications abstract this into function code.
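To make "function code" concrete, here is a minimal sketch of what an HTTP function might look like. The event shape (a dict with a `queryParameters` key) and the response fields are assumptions made for illustration; each cloud provider defines its own handler signature and event schema.

```python
import json


def handler(event: dict) -> dict:
    """Hypothetical HTTP function: takes a request event, returns a response."""
    # "queryParameters" is an assumed field name, not a real provider schema.
    name = event.get("queryParameters", {}).get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }
```

Notice what is missing: there is no web server, no port binding, and no process management. In this model, those concerns belong to the function service and the trigger.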
When a user accesses an HTTP function trigger for the first time, the trigger holds the user’s HTTP request and generates an HTTP Request event notification for the function service.
The function service then checks for idle function instances. If one is available, it is reused. If none are available, the service fetches your code from the function code repository, initializes and launches a function instance, and runs the function with the HTTP Request object passed in as a parameter.
Finally, the HTTP Response produced by the function is returned to the function trigger, which relays the result back to the waiting user client.
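The request flow described above can be sketched as a small in-memory simulation. All class and method names here (`HttpTrigger`, `FunctionService`, `FunctionInstance`, `handle_event`) are hypothetical, and a plain dict stands in for the function code repository; real platforms do this across machines, not inside one process.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], dict]


class FunctionInstance:
    """One launched copy of the function code (hypothetical name)."""

    def __init__(self, handler: Handler) -> None:
        self.handler = handler
        self.busy = False

    def invoke(self, request: dict) -> dict:
        self.busy = True
        try:
            return self.handler(request)
        finally:
            self.busy = False


class FunctionService:
    """Reuses an idle instance if one exists, otherwise cold-starts one."""

    def __init__(self, code_repository: Dict[str, Handler]) -> None:
        self.code_repository = code_repository  # stands in for the code store
        self.instances: Dict[str, List[FunctionInstance]] = {}
        self.cold_starts = 0

    def handle_event(self, function_name: str, request: dict) -> dict:
        pool = self.instances.setdefault(function_name, [])
        instance = next((i for i in pool if not i.busy), None)
        if instance is None:
            # Cold start: fetch the code, then initialize a new instance.
            handler = self.code_repository[function_name]
            instance = FunctionInstance(handler)
            pool.append(instance)
            self.cold_starts += 1
        return instance.invoke(request)


class HttpTrigger:
    """Holds the user's request, notifies the service, relays the response."""

    def __init__(self, service: FunctionService, function_name: str) -> None:
        self.service = service
        self.function_name = function_name

    def on_request(self, request: dict) -> dict:
        return self.service.handle_event(self.function_name, request)
```

Sending two requests through one `HttpTrigger` in sequence triggers a single cold start: the second request finds the instance from the first one idle and reuses it.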
The most significant difference between Serverless and application hosting PaaS platforms lies in resource utilization, which is Serverless’s most notable innovation. Serverless application instances can scale down to zero, whereas PaaS platforms require at least one server or container running at all times.
Before the first invocation, a function occupies no server resources at all. Only when a user makes an HTTP request is the function service triggered by the HTTP event, and only then does it start a function instance. This means that without user requests, the function service has no running instances and consumes no server resources. By contrast, creating an application instance on a PaaS platform typically takes tens of seconds, so to keep the service available, at least one server must run your application instance continuously.
To draw an analogy, Serverless is akin to a sound-activated light that switches on quickly when someone is present and turns off when no one is around. Compared to traditional manually operated lights, sound-activated lights excel in energy efficiency. However, this energy saving is only practical because the light turns on quickly when it is needed.
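Scaling to zero can be illustrated with a toy pool that discards instances once they have been idle for too long. The idle timeout is an assumption of this sketch (the article does not say how providers decide when to reclaim an instance), and `ScaleToZeroPool` is a hypothetical name. Time is passed in explicitly so the behavior is easy to follow.

```python
from typing import Dict


class ScaleToZeroPool:
    """Toy model: instances are reclaimed after `idle_timeout_s` of inactivity."""

    def __init__(self, idle_timeout_s: float) -> None:
        self.idle_timeout_s = idle_timeout_s  # assumed reclaim policy
        self.last_used: Dict[int, float] = {}  # instance id -> last use time
        self._next_id = 0

    @property
    def running(self) -> int:
        return len(self.last_used)

    def reap(self, now: float) -> None:
        """Remove every instance that has been idle for the full timeout."""
        expired = [
            instance_id
            for instance_id, used_at in self.last_used.items()
            if now - used_at >= self.idle_timeout_s
        ]
        for instance_id in expired:
            del self.last_used[instance_id]

    def handle_request(self, now: float) -> bool:
        """Serve a request at time `now`; return True if it was a cold start."""
        self.reap(now)
        if self.last_used:
            instance_id = next(iter(self.last_used))  # reuse a live instance
            cold_start = False
        else:
            instance_id = self._next_id  # nothing running: start from zero
            self._next_id += 1
            cold_start = True
        self.last_used[instance_id] = now
        return cold_start
```

With a 60-second timeout, a request at `now=0.0` is a cold start, a request at `now=10.0` reuses the instance, and after `reap(now=100.0)` the pool is back to zero running instances.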
Similarly, the key to Serverless’s advantages lies in its rapid startup time. How does it achieve this?
Why Can Serverless Start Up So Quickly?
Cold start was originally a PC concept: after a power cycle, the machine has to reload the BIOS and boot up from the hardware drivers, which makes startup slow.
In today’s cloud environments, power cycling physical servers is almost unheard of. In the context of Serverless, cold start refers to the entire process from function invocation to function instance readiness. Our focus here is on minimizing the startup time, as shorter startup times directly translate to higher resource utilization. Current cloud providers, leveraging language-specific optimizations, have achieved average cold start times between 100 and 700 milliseconds. Node.js, which benefits from the just-in-time compilation in Google’s V8 JavaScript engine, is among the fastest to cold start.
It’s worth noting that, in the best case, a Serverless service can start from scratch, execute a function, and complete the process within 100 milliseconds, which is a key reason why Serverless can confidently scale down to zero. When opening a webpage, a response time of under one second is generally considered excellent. In this context, a 100-millisecond startup time has a negligible impact on page load times.
Moreover, it’s safe to assume that cloud providers will continue to optimize their infrastructure for even faster startup times, ultimately leading to higher resource utilization. For instance, downloading function code is a time-consuming step during cold starts. Therefore, when you update your code, cloud providers often proactively schedule resources to download it and build container images for your function instances. When the first request arrives, they can use these cached images, skipping the code download step of a cold start and launching the container directly from the image. This technique is known as warm start. Consequently, for latency-sensitive applications, we can use warm starts or instance pre-warming strategies to shorten cold starts or avoid them altogether.
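The difference between the two paths can be sketched as follows. `ImageCache`, its methods, and the step names are hypothetical, and the "image" is just a string; the point is only to show which steps a request skips when the image was built ahead of time.

```python
from typing import Dict, List, Tuple


class ImageCache:
    """Toy model of images built at code-update time (warm start)."""

    def __init__(self) -> None:
        # (function name, code version) -> image reference
        self._images: Dict[Tuple[str, str], str] = {}

    def on_code_update(self, function_name: str, version: str) -> None:
        """Build and cache the image before any request arrives."""
        self._images[(function_name, version)] = f"image:{function_name}:{version}"

    def start_instance(self, function_name: str, version: str) -> List[str]:
        """Return the steps this start had to perform, in order."""
        steps: List[str] = []
        key = (function_name, version)
        if key not in self._images:
            # No cached image: pay for the download and build on the request path.
            steps += ["download_code", "build_image"]
            self._images[key] = f"image:{function_name}:{version}"
        steps.append("launch_container")
        return steps
```

If `on_code_update("api", "v2")` ran at deploy time, then `start_instance("api", "v2")` only has to launch the container. Without it, the first request also pays for the download and the image build.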
How is Serverless Layered?
When your Serverless instance executes, it comprises at least three layers: container, runtime, and function code.
Think of the container as the operating system (OS). Code execution requires interaction with hardware, and the container simulates the kernel and hardware information, allowing your code and runtime to function within it. Container information includes memory size, OS version, CPU details, environment variables, and more. Currently, FaaS implementations may utilize Docker containers, virtual machines (VMs), or even sandbox environments.
The runtime represents the context in which your function executes. Runtime information includes the programming language and version used, such as Node.js v10 or Python 3.6; callable objects, such as the Aliyun (Alibaba Cloud) SDK; and system information like environment variables.
What are the benefits of this layered approach? The container layer is the most generic, so cloud providers can pre-warm large numbers of container instances, carving physical server resources into small ready-to-use pieces. Runtime instances are more specific, since each one is tied to a language and version, so they are pre-warmed in smaller numbers. Once the container and runtime are in place, downloading and executing the code is straightforward. This layered architecture makes resource use efficient, allowing your code to start quickly and run cost-effectively.
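One way to picture the layered pre-warming is a pool that reports how much work is left for each request. This is a simplified sketch with hypothetical names (`LayeredPool`, `acquire`, and the step labels), and the pool sizes are arbitrary; it is not how any particular provider schedules resources.

```python
from collections import deque
from typing import Deque, Dict, List


class LayeredPool:
    """Toy model: many generic containers, fewer runtime-specific instances."""

    def __init__(self, generic_containers: int, runtimes: Dict[str, int]) -> None:
        # Generic containers can host any runtime, so they are pre-warmed widely.
        self.containers: Deque[str] = deque(
            f"container-{i}" for i in range(generic_containers)
        )
        # Runtime instances are tied to one language, so fewer are pre-warmed.
        self.runtimes: Dict[str, Deque[str]] = {
            name: deque(f"{name}-{i}" for i in range(count))
            for name, count in runtimes.items()
        }

    def acquire(self, runtime: str) -> List[str]:
        """Return the layers that still have to be initialized for a request."""
        ready = self.runtimes.get(runtime)
        if ready:
            ready.popleft()  # container and runtime are both warm
            return ["load_code"]
        if self.containers:
            self.containers.popleft()  # warm container, runtime must be started
            return ["start_runtime", "load_code"]
        return ["create_container", "start_runtime", "load_code"]
```

The closer a pre-warmed resource is to the function's needs, the fewer steps remain when a request arrives. A request only pays the full three-step cost when both pools are exhausted.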
Summary
A pure Serverless application’s invocation chain consists of three main components: function triggers, function services, and function code. These respectively replace the traditional server-side operations of load balancing & reverse proxies, servers & application runtime environments, and application code deployment.
The most significant difference between Serverless and traditional application hosting PaaS platforms lies in the ability of Serverless applications to scale down to zero and rapidly start up upon event triggers. Node.js functions, for instance, can achieve startup and execution within 100 milliseconds.
Serverless, by design, sacrifices user control and application scope to simplify the code model. Its layered structure further enhances resource utilization, which is a primary contributor to its remarkably short cold start times.
Novita AI will launch a Serverless service for our users. Join the waiting list now and start your business with serverless computing.
Originally published at Novita AI
Novita AI is the All-in-one cloud platform that empowers your AI ambitions. Integrated APIs, serverless, GPU Instance — the cost-effective tools you need. Eliminate infrastructure, start free, and make your AI vision a reality.