But if I started again, these are the 7 concepts I would focus on learning first.
Availability
The percentage of time the system is up in a defined timeframe.
99.999% -> 864 milliseconds of downtime per day. Do these numbers sound familiar?
This is a simple translation from percentage to time.
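As a rough sketch of that translation, the arithmetic is just the unavailable fraction multiplied by the length of a day. The function name `downtime_per_day_ms` is made up for illustration:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def downtime_per_day_ms(availability_percent: float) -> float:
    """Allowed downtime per day, in milliseconds, for a given availability."""
    unavailable_fraction = 1 - availability_percent / 100
    return unavailable_fraction * SECONDS_PER_DAY * 1000


# Print the daily downtime budget for a few common availability targets.
for pct in (99.0, 99.9, 99.99, 99.999):
    print(f"{pct}% -> {downtime_per_day_ms(pct):,.0f} ms of downtime per day")
```

For 99.999%, this works out to 0.00001 x 86,400 s = 0.864 s, which is the 864 milliseconds quoted above.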
Scalability
The ability of a system to handle increasing amounts of traffic without sacrificing performance.
Two ways to scale:
- Vertical
- Horizontal
A scalable system can easily accommodate more users or data without becoming overloaded.
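To make horizontal scaling a bit more concrete, here is a minimal sketch of a round-robin load balancer that spreads requests over a pool of servers. The class name `RoundRobinBalancer` and the server names are hypothetical, and real load balancers also deal with health checks, weights, and concurrency, which this sketch ignores:

```python
class RoundRobinBalancer:
    """Toy load balancer: hands out servers from a pool in rotation."""

    def __init__(self, servers):
        self.servers = list(servers)
        self._index = 0

    def add_server(self, server):
        # Horizontal scaling: grow capacity by adding another machine.
        self.servers.append(server)

    def route(self):
        # Pick the next server in rotation.
        server = self.servers[self._index % len(self.servers)]
        self._index += 1
        return server


balancer = RoundRobinBalancer(["app-1", "app-2"])  # hypothetical server names
print([balancer.route() for _ in range(4)])

balancer.add_server("app-3")  # scale out under more traffic
print([balancer.route() for _ in range(3)])
```

Vertical scaling, by contrast, would leave the pool at one server and give that machine more CPU or memory.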
Cache
Temporarily keeping frequently used data in high-speed storage reduces access time.
- When to cache
- What to cache
- For how long should it be cached?
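One common way to answer the "for how long" question is a time-to-live (TTL). Below is a minimal sketch of a TTL cache, assuming a single process and no size limit. `TTLCache`, `slow_lookup`, and the 30-second TTL are illustrative choices, not recommendations:

```python
import time


class TTLCache:
    """Toy cache: each entry expires ttl_seconds after it was loaded."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested
        self._store = {}    # key -> (value, expires_at)

    def get(self, key, loader):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]  # cache hit: entry is still fresh
        value = loader(key)  # cache miss or expired: reload from the source
        self._store[key] = (value, now + self.ttl)
        return value


def slow_lookup(user_id):
    # Stand-in for a slow database or network call (hypothetical).
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=30)
print(cache.get(42, slow_lookup))  # first call loads from the slow source
print(cache.get(42, slow_lookup))  # second call is served from the cache
```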
Resiliency
The ability of a distributed system to recover from failures or disruptions.
The key strategies:
- Redundancy
- Fault tolerance
- Monitoring and testing
- Disaster recovery
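As a small illustration of redundancy and fault tolerance working together, the sketch below tries redundant replicas in order and fails over when one is unreachable. The function `call_with_failover` and the replica functions are hypothetical, and treating `ConnectionError` as the only recoverable failure is an assumption:

```python
def call_with_failover(replicas, request):
    """Send the request to the first replica that responds."""
    last_error = None
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            last_error = exc  # remember the failure, try the next replica
    raise RuntimeError(f"all {len(replicas)} replicas failed") from last_error


def primary(request):
    # Simulated outage on the primary (hypothetical).
    raise ConnectionError("primary is down")


def secondary(request):
    return f"handled {request!r} on secondary"


print(call_with_failover([primary, secondary], "GET /orders"))
```

A production version would usually add timeouts, retries with backoff, and monitoring so that the failed replica gets noticed and repaired.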
Data Management
Techniques used to manage data in a distributed system:
- Replication
- Sharding
- Partitioning
You should store and manage data efficiently, reliably, and securely.
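For sharding, one simple approach is to hash each key and take the result modulo the number of shards. This is only a sketch: `shard_for` and the key names are made up, and plain modulo hashing moves most keys when the shard count changes, which is why schemes like consistent hashing are often preferred in practice:

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index using a stable hash."""
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used here to get a stable result.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


for user in ("alice", "bob", "carol"):  # hypothetical keys
    print(user, "-> shard", shard_for(user, num_shards=4))
```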
Performance
The speed and efficiency of a distributed system:
- Response time
- Throughput
- Latency
A high-performing system can handle large volumes of data or traffic quickly and efficiently.
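These metrics are easy to compute once you have measurements. The sketch below derives throughput and latency percentiles from a list of samples. The numbers are invented for illustration, and the nearest-rank percentile is just one of several common definitions:

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample covering p% of the data."""
    ordered = sorted(samples)
    rank = max(math.ceil(p / 100 * len(ordered)), 1)
    return ordered[rank - 1]


# Made-up measurements: 10 requests completed within a 2-second window.
latencies_ms = [12, 15, 11, 14, 250, 13, 12, 16, 13, 12]
window_seconds = 2.0

throughput = len(latencies_ms) / window_seconds
print(f"throughput: {throughput} requests/second")
print(f"p50 latency: {percentile(latencies_ms, 50)} ms")
print(f"p99 latency: {percentile(latencies_ms, 99)} ms")
```

Here the median is 13 ms while the 99th percentile is 250 ms, which shows why tail latency is usually tracked separately from the average.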
Security
How well your distributed system is protected, including:
- Authentication
- Encryption
- Access control
- Auditing
Data breaches are not fun to deal with.
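On the authentication side, one basic habit is never storing passwords in plain text. Here is a minimal sketch using only Python's standard library. The helper names and the iteration count are illustrative assumptions, so check current guidance before picking real parameters:

```python
import hashlib
import hmac
import os

ITERATIONS = 200_000  # illustrative value, not a recommendation


def hash_password(password, salt=None):
    """Derive a salted hash so the plain-text password is never stored."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, ITERATIONS
    )
    return salt, digest


def verify_password(password, salt, expected_digest):
    """Recompute the hash and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected_digest)


salt, stored = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, stored))
print(verify_password("wrong guess", salt, stored))
```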
What else would you add?