Secure Programming HOWTO

David A. Wheeler

v3.71, 2015-09-01

This book provides a set of design and implementation guidelines for writing secure programs. Such programs include application programs used as viewers of remote data, web applications (including CGI scripts), network servers, and setuid/setgid programs. Specific guidelines for C, C++, Java, Perl, PHP, Python, Tcl, and Ada95 are included. The book especially covers Linux and Unix-based systems, but much of its material applies to any system. For a current version of the book, see http://www.dwheeler.com/secure-programs


Table of Contents
1. Introduction
2. Background
2.1. History of Unix, Linux, and Open Source / Free Software
2.1.1. Unix
2.1.2. Free Software Foundation
2.1.3. Linux
2.1.4. Open Source / Free Software
2.1.5. Comparing Linux and Unix
2.2. Security Principles
2.3. Why do Programmers Write Insecure Code?
2.4. Is Open Source Good for Security?
2.4.1. View of Various Experts
2.4.2. Why Closing the Source Doesn’t Halt Attacks
2.4.3. Why Keeping Vulnerabilities Secret Doesn’t Make Them Go Away
2.4.4. How OSS/FS Counters Trojan Horses
2.4.5. Other Advantages
2.4.6. Bottom Line
2.5. Types of Secure Programs
2.6. Paranoia is a Virtue
2.7. Why Did I Write This Document?
2.8. Sources of Design and Implementation Guidelines
2.9. Other Sources of Security Information
2.10. Document Conventions
3. Summary of Linux and Unix Security Features
3.1. Processes
3.1.1. Process Attributes
3.1.2. POSIX Capabilities
3.1.3. Process Creation and Manipulation
3.2. Files
3.2.1. Filesystem Object Attributes
3.2.2. POSIX Access Control Lists (ACLs)
3.2.2.1. History of POSIX Access Control Lists (ACLs)
3.2.2.2. Data used in POSIX Access Control Lists (ACLs)
3.2.3. Creation Time Initial Values
3.2.4. Changing Access Control Attributes
3.2.5. Using Access Control Attributes
3.2.6. Filesystem Hierarchy
3.3. System V IPC
3.4. Sockets and Network Connections
3.5. Signals
3.6. Quotas and Limits
3.7. Dynamically Linked Libraries
3.8. Audit
3.9. PAM
3.10. Specialized Security Extensions for Unix-like Systems
4. Security Requirements
4.1. Common Criteria Introduction
4.2. Security Environment and Objectives
4.3. Security Functionality Requirements
4.4. Security Assurance Measure Requirements
5. Validate All Input
5.1. Basics of input validation
5.2. Input Validation Tools including Regular Expressions
5.2.1. Introduction to regular expressions
5.2.2. Using regular expressions for input validation
5.2.3. Regular expression denial of service (reDOS) attacks
5.3. Command line
5.4. Environment Variables
5.4.1. Some Environment Variables are Dangerous
5.4.2. Environment Variable Storage Format is Dangerous
5.4.3. The Solution - Extract and Erase
5.4.4. Don’t Let Users Set Their Own Environment Variables
5.5. File Descriptors
5.6. File Names
5.7. File Contents
5.8. Web-Based Application Inputs (Especially CGI Scripts)
5.9. Other Inputs
5.10. Human Language (Locale) Selection
5.10.1. How Locales are Selected
5.10.2. Locale Support Mechanisms
5.10.3. Legal Values
5.10.4. Bottom Line
5.11. Character Encoding
5.11.1. Introduction to Character Encoding
5.11.2. Introduction to UTF-8
5.11.3. UTF-8 Security Issues
5.11.4. UTF-8 Legal Values
5.11.5. UTF-8 Related Issues
5.12. Prevent Cross-site Malicious Content on Input
5.13. Filter HTML/URIs That May Be Re-presented
5.13.1. Remove or Forbid Some HTML Data
5.13.2. Encoding HTML Data
5.13.3. Validating HTML Data
5.13.4. Validating Hypertext Links (URIs/URLs)
5.13.5. Other HTML tags
5.13.6. Related Issues
5.14. Forbid HTTP GET To Perform Non-Queries
5.15. Counter SPAM
5.16. Limit Valid Input Time and Load Level
6. Restrict Operations to Buffer Bounds (Avoid Buffer Overflow)
6.1. Dangers in C/C++
6.2. Library Solutions in C/C++
6.2.1. Standard C Library Solution
6.2.2. Static and Dynamically Allocated Buffers
6.2.3. strlcpy and strlcat
6.2.4. asprintf and vasprintf
6.2.5. libmib
6.2.6. Safestr library (Messier and Viega)
6.2.7. C++ std::string class
6.2.8. Libsafe
6.2.9. Other Libraries
6.3. Compilation Solutions in C/C++
6.4. Other Languages
7. Design Your Program for Security
7.1. Follow Good Security Design Principles
7.2. Secure the Interface
7.3. Separate Data and Control
7.4. Minimize Privileges
7.4.1. Minimize the Privileges Granted
7.4.2. Minimize the Time the Privilege Can Be Used
7.4.3. Minimize the Time the Privilege is Active
7.4.4. Minimize the Modules Granted the Privilege
7.4.5. Consider Using FSUID To Limit Privileges
7.4.6. Consider Using Chroot to Minimize Available Files
7.4.7. Consider Minimizing the Accessible Data
7.4.8. Consider Minimizing the Resources Available
7.5. Minimize the Functionality of a Component
7.6. Avoid Creating Setuid/Setgid Scripts
7.7. Configure Safely and Use Safe Defaults
7.8. Load Initialization Values Safely
7.9. Minimize the Accessible Data
7.10. Fail Safe
7.11. Avoid Race Conditions
7.11.1. Sequencing (Non-Atomic) Problems
7.11.1.1. Atomic Actions in the Filesystem
7.11.1.2. Temporary Files
7.11.2. Locking
7.11.2.1. Using Files as Locks
7.11.2.2. Other Approaches to Locking
7.12. Trust Only Trustworthy Channels
7.13. Set up a Trusted Path
7.14. Use Internal Consistency-Checking Code
7.15. Self-limit Resources
7.16. Prevent Cross-Site (XSS) Malicious Content
7.16.1. Explanation of the Problem
7.16.2. Solutions to Cross-Site Malicious Content
7.16.2.1. Identifying Special Characters
7.16.2.2. Filtering
7.16.2.3. Encoding (Quoting)
7.17. Foil Semantic Attacks
7.18. Be Careful with Data Types
7.19. Avoid Algorithmic Complexity Attacks
8. Carefully Call Out to Other Resources
8.1. Call Only Safe Library Routines
8.2. Limit Call-outs to Valid Values
8.3. Handle Metacharacters
8.3.1. SQL injection
8.3.2. Shell injection
8.3.3. Problematic pathnames and filenames
8.3.4. Other injection issues
8.4. Call Only Interfaces Intended for Programmers
8.5. Check All System Call Returns
8.6. Avoid Using vfork(2)
8.7. Counter Web Bugs When Retrieving Embedded Content
8.8. Hide Sensitive Information
9. Send Information Back Judiciously
9.1. Minimize Feedback
9.2. Don’t Include Comments
9.3. Handle Full/Unresponsive Output
9.4. Control Data Formatting (Format Strings)
9.5. Control Character Encoding in Output
9.6. Prevent Include/Configuration File Access
10. Language-Specific Issues
10.1. C/C++
10.2. Perl
10.3. Python
10.4. Shell Scripting Languages (sh and csh Derivatives)
10.5. Ada
10.6. Java
10.7. Tcl
10.8. PHP
11. Special Topics
11.1. Passwords
11.2. Authenticating on the Web
11.2.1. Authenticating on the Web: Logging In
11.2.2. Authenticating on the Web: Subsequent Actions
11.2.3. Authenticating on the Web: Logging Out
11.3. Random Numbers
11.4. Specially Protect Secrets (Passwords and Keys) in User Memory
11.5. Cryptographic Algorithms and Protocols
11.5.1. Cryptographic Protocols
11.5.2. Symmetric Key Encryption Algorithms
11.5.3. Public Key Algorithms
11.5.4. Cryptographic Hash Algorithms
11.5.5. Integrity Checking
11.5.6. Randomized Message Authentication Mode (RMAC)
11.5.7. Other Cryptographic Issues
11.6. Using PAM
11.7. Tools
11.8. Windows CE
11.9. Write Audit Records
11.10. Physical Emissions
11.11. Miscellaneous
12. Conclusion
13. Bibliography
A. History
B. Acknowledgements
C. About the Documentation License
D. GNU Free Documentation License
E. Endorsements
F. About the Author
Index
List of Tables
3-1. POSIX ACL Entry Types
5-1. Legal UTF-8 Sequences
List of Figures
1-1. Abstract View of a Program
6-1. A physical buffer overflow: The Montparnasse derailment of 1895