For a number of years now, microprocessor designers have been focusing on increasing core count rather than individual clock rate, even though transistor counts are still tracking Moore's Law. As multi-core processors become more prevalent, programs need to be multi-threaded to take advantage of the parallel processing capabilities on offer. But concurrent programming is hard. Damn hard. One solution is to use a functional language like Erlang, Haskell or F#, since these languages often employ immutable "variables", which makes reasoning about state management significantly easier (read: far fewer side effects). However, old-school folks who have to use their current language (C++/Java/C#) can use threads to achieve concurrency, as long as they also employ various constructs/patterns to manage the synchronisation of those threads.
It might seem obvious, but remember that concurrency is for performance, and there are generally three ways in which concurrent execution can improve it:
- to reduce latency (making the operation execute faster);
- to hide latency (avoiding blocking other operations whilst slow operations execute); or
- to increase throughput (making the system perform more work per unit time).
Reducing latency requires that the unit of work in question be "parallelizable". That is, you need an algorithm that lets you divide the work into chunks that can execute in parallel, and also a way of putting the chunks back together to complete the work. You also need to ensure that the cost of communication between the processing sub-components, and of re-assembling the interim results, doesn't outweigh the benefit of the parallel operations. The same problems face proponents of grid/cloud computing.
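As a rough sketch in Java (one of the languages mentioned above), here's the divide/compute/re-assemble shape applied to summing an array. The class and method names are mine, not from any particular library:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ParallelSum {
    // Divide the array into chunks, sum each chunk on its own thread,
    // then re-assemble the partial sums into the final answer.
    static long parallelSum(long[] data, int nThreads) {
        ExecutorService pool = Executors.newFixedThreadPool(nThreads);
        int chunk = (data.length + nThreads - 1) / nThreads;
        List<Future<Long>> parts = new ArrayList<>();
        for (int i = 0; i < data.length; i += chunk) {
            final int start = i, end = Math.min(i + chunk, data.length);
            parts.add(pool.submit(() -> {
                long s = 0;
                for (int j = start; j < end; j++) s += data[j];
                return s;
            }));
        }
        long total = 0;
        try {
            // Re-assembly: combine the interim results.
            for (Future<Long> f : parts) total += f.get();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        pool.shutdown();
        return total;
    }
}
```

Note that the re-assembly here is trivially cheap (a few additions), which is exactly what makes summation a good candidate; for problems where combining the chunks is expensive, the parallel version can end up slower than the sequential one.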
An example of using concurrency to reduce latency is a database engine servicing a query. At a certain point it knows which data pages it needs, and it knows which of these are already in memory and which are not - the latter must be fetched from disk. The database engine can spawn multiple threads to read different data pages off disk into memory. This parallel loading reduces the time taken to service the query.
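That idea can be sketched in Java like so. This is purely illustrative - `readPageFromDisk` is a made-up stand-in for the real I/O, and no actual database works exactly this way:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PageLoader {
    // Hypothetical stand-in for a slow disk read.
    static String readPageFromDisk(int pageId) {
        try { Thread.sleep(50); } catch (InterruptedException e) { }
        return "page-" + pageId;
    }

    // Pages already in the cache are left alone; the rest are fetched
    // from "disk" in parallel and merged back into the cache.
    static Map<Integer, String> loadPages(List<Integer> pageIds,
                                          Map<Integer, String> cache) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Map<Integer, Future<String>> pending = new HashMap<>();
        for (Integer id : pageIds) {
            if (!cache.containsKey(id)) {
                pending.put(id, pool.submit(() -> readPageFromDisk(id)));
            }
        }
        try {
            for (Map.Entry<Integer, Future<String>> e : pending.entrySet()) {
                cache.put(e.getKey(), e.getValue().get());
            }
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        pool.shutdown();
        return cache;
    }
}
```

With four reads in flight at once, the total wait is roughly that of the slowest read rather than the sum of all of them.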
Obviously, the degree to which one can use concurrency to reduce latency is dependent upon how "parallelizable" the problem is.
If the operation is long-running and not amenable to parallelization, concurrent execution can instead be used to get on with other tasks while the long-running task executes on another thread. Here we aren't reducing the latency of the task; we're merely hiding it.
This sort of concurrency is common in UI development, where we want to improve the perceived responsiveness of the UI, so we push slow operations onto asynchronous background threads.
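A minimal Java sketch of latency hiding, using `CompletableFuture` to push a slow operation onto a background thread. `buildReport` is a hypothetical placeholder for whatever the slow work is:

```java
import java.util.concurrent.CompletableFuture;

public class BackgroundWork {
    // Hypothetical slow operation (e.g. a network call or report build).
    static String buildReport() {
        try { Thread.sleep(100); } catch (InterruptedException e) { }
        return "report ready";
    }

    // Kick the slow work off onto a background thread. The calling
    // thread (say, a UI event loop) returns immediately and stays
    // free to handle other events while the work runs.
    static CompletableFuture<String> buildReportAsync() {
        return CompletableFuture.supplyAsync(BackgroundWork::buildReport);
    }
}
```

The caller carries on with other work and consumes the result later, either by attaching a continuation with `thenAccept(...)` or by calling `join()` once it actually needs the value. Note the report still takes just as long to build - we've hidden the latency, not reduced it.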
Here, instead of using parallel operations to make a single operation faster (case 1 above), we employ multiple concurrent operations to get more operations processed per unit time. For example, say we have a service/daemon running on a server that watches the file system, and when it sees an mp3 file appear in a certain directory it fetches it and performs some transcoding operations. If the server is a multi-core machine, it's beneficial to have the daemon use a pool of X threads, where each thread is given a single file to process. This increases the throughput of the daemon by virtue of concurrent operations.
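The daemon's worker pool can be sketched in Java as follows. `transcode` is a hypothetical placeholder for the real encoding step, and the file names are invented for illustration:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class TranscodeDaemon {
    // Hypothetical transcode step; real work would invoke an encoder.
    static String transcode(String file) {
        return file.replace(".mp3", ".ogg");
    }

    // A fixed pool of X worker threads; each worker handles one file
    // at a time, so up to X files are in flight concurrently.
    static List<String> processAll(List<String> files, int x) {
        ExecutorService pool = Executors.newFixedThreadPool(x);
        List<Future<String>> results = new ArrayList<>();
        for (String f : files) results.add(pool.submit(() -> transcode(f)));
        List<String> out = new ArrayList<>();
        try {
            for (Future<String> r : results) out.add(r.get());
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        pool.shutdown();
        return out;
    }
}
```

No single file is transcoded any faster than before; the win is that on an X-core box, X files are being worked on at once, so more files get through the system per unit time.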
In a future post I'll dive into the details of some locking constructs, and also some of the patterns that make concurrent programming easier.