But if I started again, these are the 7 concepts I would focus on learning first.
Availability
The percentage of time the system is up in a defined timeframe.
99.999% -> 864.00 milliseconds of downtime per day. Do these numbers sound familiar?
This is a simple translation from percentage to time: 0.001% of the 86,400 seconds in a day.
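To make the translation concrete, here is a small sketch of the arithmetic. The function name `downtime_per_day_ms` is my own choice for illustration, and it assumes a 24-hour measurement window.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def downtime_per_day_ms(availability_percent: float) -> float:
    """Allowed downtime per day, in milliseconds, for an availability percentage."""
    unavailable_fraction = (100.0 - availability_percent) / 100.0
    return unavailable_fraction * SECONDS_PER_DAY * 1000.0


# Print the daily downtime budget for a few common availability targets.
for target in (99.0, 99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_per_day_ms(target):,.2f} ms downtime per day")
```

Each extra nine cuts the downtime budget by a factor of ten.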
Scalability
The ability of a system to handle increasing amounts of traffic without sacrificing performance.
Two ways to scale:
  • Vertical
  • Horizontal
A scalable system can easily accommodate more users or data without becoming overloaded.
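Horizontal scaling only helps if traffic is spread across the extra machines. As a rough sketch, round-robin is one of the simplest ways to do that. The server names below are hypothetical, and real load balancers also account for health and load.

```python
import itertools
from typing import Iterator, List


def round_robin(servers: List[str]) -> Iterator[str]:
    """Yield servers in a repeating cycle so requests are spread evenly."""
    return itertools.cycle(servers)


servers = ["app-1", "app-2", "app-3"]  # hypothetical server names
picker = round_robin(servers)

# Assign six incoming requests; each server receives two of them.
assignments = [next(picker) for _ in range(6)]
print(assignments)
```

Adding a fourth server to the list is the horizontal move. Giving "app-1" more CPU or memory would be the vertical one.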
Cache
Caching temporary data in high-speed data storage reduces access time.
  • When to cache
  • What to cache
  • For how long should it be cached?
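The "for how long" question is often answered with a time-to-live (TTL). Below is a minimal in-memory sketch, assuming a single process and no size limit. `TTLCache` and its `clock` parameter are names I made up for illustration, not a library API.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache where every entry expires after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._store: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any) -> None:
        # Store the value together with the moment it stops being valid.
        self._store[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._store.get(key)
        if entry is None:
            return None  # cache miss
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: evict and report a miss
            return None
        return value
```

A short TTL keeps data fresh but sends more traffic to the slow store. A long TTL does the opposite.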
Resiliency
The ability of a distributed system to recover from failures or disruptions.
The key strategies:
  • Redundancy
  • Fault tolerance
  • Monitoring and testing
  • Disaster recovery
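One common fault-tolerance tactic is retrying a failed call with exponential backoff. This is only a sketch under simple assumptions: every exception is treated as retryable, and the helper name `retry` and its defaults are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry(operation: Callable[[], T], attempts: int = 3, base_delay: float = 0.1,
          sleep: Callable[[float], None] = time.sleep) -> T:
    """Call operation, retrying on any exception with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait base_delay, then 2x, then 4x, ... before the next try.
            sleep(base_delay * (2 ** attempt))
    raise AssertionError("unreachable")
```

In a real system you would usually retry only errors known to be transient and add some random jitter to the delay.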
Data Management
Techniques used to manage data in a distributed system:
  • Replication
  • Sharding
  • Partitioning
You should store and manage data efficiently, reliably, and securely.
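To give a feel for sharding, here is a minimal sketch of hash-based shard selection. The function name `shard_for` and the key format are assumptions for illustration.

```python
import hashlib


def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard index in [0, num_shards) using a stable hash."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # A cryptographic hash is used only for its stable, well-spread output.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# The same key always lands on the same shard.
print(shard_for("user:42", 4))
```

Keep in mind that plain modulo sharding moves most keys when the number of shards changes. Consistent hashing is a common way to reduce that.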
Performance
The speed and efficiency of a distributed system, measured by:
  • Response time
  • Throughput
  • Latency
A high-performing system can handle large volumes of data or traffic quickly and efficiently.
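These metrics are easier to reason about once you measure them. The sketch below times a callable in a single process and reports throughput plus response-time percentiles. `measure` and `percentile` are hypothetical helper names, and the percentile uses the simple nearest-rank method.

```python
import math
import time
from typing import Callable, Dict, List


def percentile(sorted_values: List[float], p: float) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p / 100.0 * len(sorted_values)))
    return sorted_values[rank - 1]


def measure(operation: Callable[[], object], requests: int) -> Dict[str, float]:
    """Run operation sequentially and report throughput and response times."""
    timings: List[float] = []
    started = time.perf_counter()
    for _ in range(requests):
        t0 = time.perf_counter()
        operation()
        timings.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - started
    timings.sort()
    return {
        "throughput_rps": requests / elapsed,  # completed calls per second
        "p50_response_s": percentile(timings, 50),
        "p99_response_s": percentile(timings, 99),
    }
```

Percentiles matter more than averages here, because a few slow requests can hide behind a good mean.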
Security
How well your distributed system is protected, including:
  • Authentication
  • Encryption
  • Access control
  • Auditing
Data breaches are not fun to deal with.
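As one small example on the authentication side, passwords should be stored as salted, slow hashes rather than plain text. This sketch uses only Python's standard library. The helper names and the iteration count are assumptions, so check current guidance before reusing them.

```python
import hashlib
import hmac
import os
from typing import Tuple

ITERATIONS = 600_000  # assumption: tune for your hardware and current guidance


def hash_password(password: str, iterations: int = ITERATIONS) -> Tuple[bytes, bytes]:
    """Return (salt, digest) for a password using PBKDF2-HMAC-SHA256."""
    salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes,
                    iterations: int = ITERATIONS) -> bool:
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return hmac.compare_digest(candidate, expected)
```

The constant-time comparison avoids leaking how many leading bytes matched.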
What else would you add?