2008-05-20
On Designing and Deploying Internet-Scale Services
Keywords: notes on designing and deploying Internet services
Three simple tenets
1. Expect failures
2. Keep things simple
3. Automate everything
1. Overall Application Design
1. Design for failure
2. Redundancy and fault recovery
3. Commodity hardware slice
4. Single-version software
5. Multi-tenancy
6. Quick service health check
7. Develop in the full environment
8. Zero trust of underlying components
9. Do not build the same functionality in multiple components
10. One pod or cluster should not affect another pod or cluster
11. Allow rare emergency human intervention
12. Keep things simple and robust
13. Enforce admission control at all levels
14. Partition the service
15. Understand the network design
16. Analyze throughput and latency
17. Treat operations utilities as part of the service
18. Understand access patterns
19. Version everything
20. Keep the unit/functional tests from the last release
21. Avoid single points of failure
2. Automatic Management and Provisioning
1. Be restartable and redundant
2. Support geo-distribution
3. Automatic provisioning and installation
4. Configuration and code as a unit
5. Manage server roles or personalities rather than servers
6. Multi-system failures are common
7. Recover at the service level
8. Never rely on local storage for non-recoverable information
9. Keep deployment simple
10. Fail services regularly
3. Dependency Management
1. Expect latency
2. Isolate failures
3. Use shipping and proven components
4. Implement inter-service monitoring and alerting
5. Dependent services require the same design point
6. Decouple components
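The "Isolate failures" and "Expect latency" items above can be illustrated with a circuit breaker, one common way to keep a failing dependency from dragging down its callers. This is only a minimal sketch, not something taken from the paper: the class name `CircuitBreaker`, its parameters, and the fallback convention are all hypothetical.

```python
import time


class CircuitBreaker:
    """Hypothetical helper: stop calling a dependency after repeated failures."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds while open
        self.clock = clock                # injectable clock, useful for tests
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, fallback):
        # While open, skip the dependency until the cool-down has elapsed.
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback()
            # Half-open: allow one trial call; a single failure re-opens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback()
        self.failures = 0
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_profile, lambda: cached_profile)`, where both function names are placeholders. The fallback keeps the caller responsive while the dependency is unhealthy.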
4. Release Cycle and Testing
1. Ship often
2. Use production data to find problems
3. Invest in engineering
4. Support version roll-back
5. Maintain forward and backward compatibility
6. Single-server deployment
7. Stress test for load
8. Perform capacity and performance testing prior to new releases
9. Build and deploy shallowly and iteratively
10. Test with real data
11. Run system-level acceptance tests
12. Test and develop in full environments
5. Hardware Selection and Standardization
1. Use only standard SKUs
2. Purchase full racks
3. Write to a hardware abstraction
4. Abstract the network and naming
6. Operations and Capacity Planning
1. Make the development team responsible
2. Soft delete only
3. Track resource allocation
4. Make one change at a time
5. Make everything configurable
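As a rough illustration of "Soft delete only", the sketch below marks records as deleted instead of removing them, so an operator mistake can be undone until a later purge. The in-memory store, the class name `SoftDeleteStore`, and the retention parameter are assumptions made for this example; a real service would apply the same idea to its database.

```python
import time


class SoftDeleteStore:
    """Hypothetical key-value store where delete only marks a record."""

    def __init__(self, clock=time.time):
        self._rows = {}        # key -> value, including soft-deleted rows
        self._deleted_at = {}  # key -> timestamp of the soft delete
        self._clock = clock    # injectable clock, useful for tests

    def put(self, key, value):
        self._rows[key] = value
        self._deleted_at.pop(key, None)  # writing a key revives it

    def get(self, key):
        # Soft-deleted rows are hidden from normal reads.
        if key in self._deleted_at:
            return None
        return self._rows.get(key)

    def delete(self, key):
        # Record the deletion time; the data itself is kept.
        if key in self._rows:
            self._deleted_at[key] = self._clock()

    def undelete(self, key):
        # Returns True if the key was soft-deleted and is now restored.
        return self._deleted_at.pop(key, None) is not None

    def purge(self, older_than_seconds):
        # Physically remove rows soft-deleted at least this long ago.
        cutoff = self._clock() - older_than_seconds
        expired = [k for k, t in self._deleted_at.items() if t <= cutoff]
        for k in expired:
            del self._rows[k]
            del self._deleted_at[k]
        return len(expired)
```

The purge step is kept separate on purpose: it can run on a schedule with a generous retention window, leaving time to notice and reverse an accidental delete.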
7. Auditing, Monitoring and Alerting
1. Instrument everything
2. Data is the most valuable asset
3. Have a customer view of service
4. Instrumentation required for production testing
5. Latencies are the toughest problem
6. Have sufficient production data
7. Configurable logging
8. Expose health information for monitoring
9. Make all reported errors actionable
10. Enable quick diagnosis of production problems
8. Graceful Degradation and Admission Control
1. Support a "big red switch"
2. Control admission
3. Meter admission
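The three items above can be sketched together: a token bucket meters how fast requests are admitted, and a "big red switch" flag sheds non-critical work when the service is in trouble. This is a minimal sketch under stated assumptions: the token bucket is one possible metering technique, not necessarily the one the paper has in mind, and the names `AdmissionController`, `red_switch`, and `critical` are hypothetical.

```python
import time


class AdmissionController:
    """Hypothetical gate combining a big red switch with a token bucket."""

    def __init__(self, rate_per_sec, burst, clock=time.monotonic):
        self.rate = rate_per_sec    # tokens added per second
        self.burst = burst          # maximum tokens that can accumulate
        self.tokens = float(burst)  # start with a full bucket
        self.clock = clock          # injectable clock, useful for tests
        self.last = clock()
        self.red_switch = False     # when True, only critical work is admitted

    def admit(self, critical=False):
        # Big red switch: shed all non-critical requests immediately.
        if self.red_switch and not critical:
            return False
        # Refill the bucket according to the time elapsed since the last call.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        # Admit the request only if a whole token is available.
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

An operator would set `red_switch = True` during an incident. Critical requests still pass through the same rate limit, so the service degrades gracefully rather than failing outright.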
9. Customer and Press Communication Plan
10. Customer Self-Provisioning and Self-Help
- 11:50
- Category: Architecture