Synchronization is a means of solving the problems that arise from building concurrent systems. In this post, I'll paint the big picture by explaining what concurrency means, what concurrent programming languages are, and the role of synchronization in concurrent systems. In the next post, we'll learn about concurrency in a JavaScript web application. Finally, in the third post we'll examine the role of promises as a synchronization mechanism in JavaScript.
Concurrency
Concurrency is a property of a computing system. In Computing, when we say that a system is concurrent, we're saying that there are entities (computations) in the computer that are independent of each other that may also be interacting. It's how the system is structured in the abstract - the processes, agents, or actors involved in running the show. This is not the same as parallelism. Parallelism refers to how those entities are actually executed (at the same time). This is a subtle but important difference. A concurrent system may not necessarily be parallel depending on how it's implemented.
Let's take a look at a few examples involving:
- A non-concurrent system
- A concurrent, non-parallel system
- A concurrent, parallel system
Let's start with a real-world system rather than a computer. Say you want to start a coffee shop.
In the simplest scenario, you are the sole employee of the coffee shop. You are doing everything from taking orders to making coffee. This structure is not concurrent because there is only a single actor in this picture: you.
Running a coffee shop by yourself is hard, so let's hire a cashier. Now we have a concurrent system because there are two employees in the coffee shop. Note that I didn't mention anything about how these two employees are going to interact. Why? Because as far as concurrency is concerned, it doesn't matter. What matters is the structure - how we've modeled the system. The fact that there are now two people makes this system concurrent. To understand this in the context of a computing system, you just have to replace "person" with "computation".
If we have multiple people, does that mean the system must also be parallel? No!
Let's say we make up a rule that dictates that only one employee can be working at any given time. This means that if the cashier is taking customers, the coffee maker cannot be making coffee, and vice versa. This is analogous to a computing system where only one computation can be running at any given time. Of course, we all know how silly this is. What we really want is a concurrent, parallel system. The reason we hire anyone in the first place is not just to reduce the amount of work we have to do but also to improve overall efficiency (same or greater output in less time). Therefore, we don't just want two people doing work - we want them doing work in parallel. The cashier must be able to take orders while the coffee maker makes coffee.
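To make the distinction concrete in code, here's a minimal sketch in JavaScript (the language this series is headed toward). The cashier and coffeeMaker functions are hypothetical names for our two "employees". Their steps interleave on a single thread via the event loop, so the system is concurrent but not parallel - much like our one-employee-at-a-time rule:

```javascript
// Two "employees" modeled as async functions. JavaScript's event loop
// runs them concurrently on a single thread: their steps interleave,
// but only one ever executes at any given instant - concurrency
// without parallelism.
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function cashier() {
  for (let i = 1; i <= 3; i++) {
    console.log(`cashier: taking order #${i}`);
    await sleep(100); // yield to the event loop so the other task can run
  }
}

async function coffeeMaker() {
  for (let i = 1; i <= 3; i++) {
    console.log(`coffee maker: brewing coffee #${i}`);
    await sleep(100);
  }
}

// Both "employees" are started together; their output interleaves.
Promise.all([cashier(), coffeeMaker()]);
```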
Concurrency has nothing to do with execution and everything to do with structure. Any system with multiple actors is by definition concurrent, regardless of whether it's a computing system or a real-world system. If it's a computing system, it's still concurrent regardless of how the computations are executed: on a single core or multiple cores, using processes and/or threads. However, a concurrent system that is not run on a computer isn't very useful because it doesn't do anything. The purpose of creating a concurrent structure in the first place is to implement it, which requires the use of a programming language.
Concurrent programming languages
The implementation of concurrent systems is known as concurrent programming (surprise!). Some languages are designed specifically for that task, and they're called concurrent programming languages. These languages have built-in constructs designed to support the creation of independent processes as well as control over their interactions. One such language is Erlang, which has a set of primitives built into the language just for spawning processes and coordinating the communication between them. We can use a language like Erlang to implement a concurrent system such as a coffee shop simulation.
We do not, however, need to use a concurrent programming language to write a concurrent system. A concurrent programming language may have special support for concurrency, but that's not necessary. Many languages (C, Python, Java, JavaScript) are not considered "concurrent programming languages" because they were not designed for handling concurrency, yet they still let you implement concurrent computations through an API (a library or the host environment). For example, Java bundles its concurrency utilities into a standard library (java.util.concurrent) rather than into the language itself. Writing concurrent systems in a non-concurrent programming language like Java may be more difficult, but it's not impossible.
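JavaScript works the same way: the language has no construct for spawning a process, but a host environment can provide one as an API. As a sketch (assuming Node.js and its worker_threads module; browsers offer the similar Web Workers API), here's a main script spawning a second computation and coordinating with it through messages:

```javascript
// JavaScript itself has no "spawn" construct; Node's worker_threads
// module provides one as an API, along with message passing.
const { Worker, isMainThread, parentPort } = require('worker_threads');

if (isMainThread) {
  // Main thread: spawn a second computation (this same file).
  const worker = new Worker(__filename);
  worker.on('message', (msg) => {
    console.log(`main received: ${msg}`);
    worker.terminate();
  });
  worker.postMessage('one espresso, please');
} else {
  // Worker thread: receive an order, send back a result.
  parentPort.on('message', (order) => {
    parentPort.postMessage(`coffee ready for: ${order}`);
  });
}
```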
The reason programming languages offer additional support for concurrency at all (beyond simply being able to create a new thread or process) is that implementing a concurrent system introduces problems in managing how its computations interact. These problems are known as synchronization problems.
"99 Problems and they're all synchronization related ..."
Synchronization
Synchronization is essentially a means of controlling how concurrent processes interact. That interaction could be accessing a shared resource or simply passing data directly to one another. Think about it - if computations lived in isolation, there would be no need to control what they interact with. They're isolated! In practice, however, concurrent computations often work together.
The coffee shop example I gave above presents an instance of a classic synchronization problem that computer scientists call the producer-consumer problem. In this example, the two employees are dealing with a shared resource: the queue of orders. The cashier takes the order and the coffee maker makes the coffee based on the order. The first person is the producer and the second the consumer. The coffee maker shouldn't be grabbing for orders if there aren't any - we don't want the coffee maker to run back and forth even when there are no customers. That wastes energy! One synchronization mechanism we can introduce in this case is to have the coffee maker check the queue for orders. If there are no orders, the coffee maker just sits still and does nothing until notified by the cashier.
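Here's what that "wait until notified" idea can look like in JavaScript. This is a minimal sketch, not a standard API: OrderQueue, put, and take are hypothetical names, and the "notification" is a promise the producer resolves when an order arrives - a small preview of promises as a synchronization mechanism.

```javascript
// A minimal producer-consumer sketch. The order queue is the shared
// resource; when it's empty, the consumer doesn't poll - it waits on a
// promise that the producer resolves when a new order arrives.
class OrderQueue {
  constructor() {
    this.orders = [];
    this.waiting = []; // resolvers for consumers waiting on an empty queue
  }

  // Producer side: the cashier adds an order and notifies a waiter, if any.
  put(order) {
    const resolve = this.waiting.shift();
    if (resolve) {
      resolve(order); // hand the order straight to a waiting consumer
    } else {
      this.orders.push(order);
    }
  }

  // Consumer side: the coffee maker takes an order, or sits still
  // (awaits) until notified.
  take() {
    if (this.orders.length > 0) {
      return Promise.resolve(this.orders.shift());
    }
    return new Promise((resolve) => this.waiting.push(resolve));
  }
}

const queue = new OrderQueue();

(async () => {
  // Coffee maker: waits on take() instead of busy-waiting.
  const order = await queue.take();
  console.log(`making coffee for: ${order}`);
})();

// Cashier: produces an order a moment later.
setTimeout(() => queue.put('large latte'), 100);
```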
One well-known synchronization mechanism in a computing system is the semaphore, which is used to control access by multiple processes to a shared resource.
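To show the idea rather than any particular library, here's a sketch of a counting semaphore built on promises (Semaphore, acquire, and release are names made up for illustration). At most count tasks hold a permit at once; the rest wait in line:

```javascript
// A counting semaphore sketched with promises: at most `count` tasks
// may hold a permit at once; the rest queue up and are released in
// order as permits are returned.
class Semaphore {
  constructor(count) {
    this.count = count; // available permits
    this.waiting = [];  // resolvers for tasks waiting on a permit
  }

  async acquire() {
    if (this.count > 0) {
      this.count--;
      return;
    }
    await new Promise((resolve) => this.waiting.push(resolve));
  }

  release() {
    const resolve = this.waiting.shift();
    if (resolve) {
      resolve(); // hand the permit directly to the next waiter
    } else {
      this.count++;
    }
  }
}

// Usage: only one "employee" may use the espresso machine at a time.
const machine = new Semaphore(1);

async function brew(name) {
  await machine.acquire();
  try {
    console.log(`${name} is using the espresso machine`);
    await new Promise((r) => setTimeout(r, 100));
  } finally {
    machine.release();
  }
}

brew('cashier');
brew('coffee maker'); // waits until the first brew releases the machine
```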
Summary
In this post I've explained that the notion of concurrency deals with the structure of a system in the abstract - whether there are multiple actors in the system and what they do. One way to implement this model is with a concurrent programming language, but a non-concurrent programming language works too, as long as it supports the basic creation of processes. Because concurrency refers to system structure rather than process execution, the processes in the implementation may not necessarily be parallel even though the system is concurrent. Finally, when we have multiple actors in the system, we sometimes have to control how they interact (synchronization) in order to accomplish the goal effectively.