(Video from 2009: “Null References: The Billion Dollar Mistake”)
History
- 1960: Tony Hoare’s first job as a programmer was at a British computer manufacturer. After nine months there he was asked to design a new programming language.
- He found a report (23 pages): “Report on the international algorithmic language Algol 60”
- He took all the ideas from that report but left out the complicated parts, such as IF and THEN.
- In the end he proposed implementing Algol 60 on their systems.
In those days programs were written in machine code. The advantage was that you could analyze and diagnose exactly why something went wrong.
Designing the language
Principle 1:
The result of a programmer’s mistake in the code should be predictable by looking at the high-level code alone. The programmer should never have to delve into hexadecimal core dumps to find out what went wrong.
But this had an expensive consequence: whenever the program referred to an array element by subscript, the implementation had to insert a check that the subscript was within the array bounds.
Costs of the test:
Computational power was 2,000 operations per second (500 µs per operation), and a test took two operations. Each test also added two or three instructions to the length of the code. Customers accepted this.
Java likewise checks every array subscript against the array bounds.
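The subscript check described above can be seen in a small Java sketch. The class and method names are made up for illustration; only the array access and the `ArrayIndexOutOfBoundsException` come from Java itself.

```java
// Illustrative sketch: BoundsCheckDemo and describeAccess are hypothetical names.
final class BoundsCheckDemo {

    // Reads a[i]. Java checks the subscript on every array access, so an
    // invalid index raises an exception instead of touching arbitrary memory.
    static String describeAccess(int[] a, int i) {
        try {
            return "value " + a[i];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "index " + i + " out of bounds for length " + a.length;
        }
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30};
        System.out.println(describeAccess(data, 1)); // valid subscript
        System.out.println(describeAccess(data, 3)); // one past the end
    }
}
```

The mistake is reported in terms of the high-level program (an index and an array length), which is what Principle 1 asks for.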
Principle 2 – OOP with type checking:
He then suggested “record handling” for the design of an Algol successor: the concept of an object that can be referred to through a pointer.
But pointers have their problems: a pointer implemented as an integer or float can point anywhere, so an update through that pointer may land in an unintended location.
To keep a pointer from pointing just anywhere, variables must be declared in a way that says what kind of thing they point to. Types!
It is important that the programmer cannot construct a pointer to something that doesn’t exist.
Customers agreed to that, even though it requires additional compile time.
But FORTRAN developers didn’t accept it. The company built a Fortran-to-Algol 60 translator so that Fortran programs could be compiled on its systems. It was a disaster: Fortran programmers rejected it because they couldn’t run any of their programs. The compiler reported all their subscript errors, and they just wanted the programs to run.
-> Learning: Don’t try to convince everybody!
Invention of the null pointer:
Then he invented the null pointer as a possible value of every reference variable.
With null pointers you must always check for null: either run fast without checking, or run slowly with checking.
Why did he introduce null? Because it was an easy way to initialize variables of all kinds.
Initializing to a default value such as 0 or “” is possible for objects in a tree structure, but not so easy in a cyclic structure. So he took the easier way and chose null.
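A minimal Java sketch of the cyclic case, assuming a hypothetical `Person` class whose instances refer to each other. One of the two objects has to exist before the other, so its reference has no sensible initial value and Java falls back to null.

```java
// Illustrative sketch: Person and its fields are hypothetical names.
final class Person {
    final String name;
    Person partner; // reference field: Java initializes it to null by default

    Person(String name) {
        this.name = name;
    }

    // Without this null check, calling partner.name on an unlinked Person
    // would throw a NullPointerException at run time.
    String partnerName() {
        return partner == null ? "(none)" : partner.name;
    }

    public static void main(String[] args) {
        // A cycle: each object must refer to the other, but one of them
        // has to be created first, so its reference starts out as null.
        Person alice = new Person("Alice");
        Person bob = new Person("Bob");
        System.out.println(alice.partnerName()); // partner not linked yet
        alice.partner = bob;
        bob.partner = alice;
        System.out.println(alice.partnerName());
    }
}
```

The null check in `partnerName` is the price mentioned above: every use of the reference either pays for the check or risks a failure at run time.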
Language designers should take responsibility for the mistakes users make when using their programming language! -> Avoidance of error as a design principle.
FAQ Question:
Aren’t pointers better than objects? E.g. in C# everything is an object, even things that aren’t really objects (strings etc.)
Hoare: Pointers are a kind of jump in the code. They are useful, but by looking at them you can’t see what they are doing.
There are other jumps such as loops, conditionals (if..then), exceptions, coroutines etc. These are explicit, and their proof rules are simple and well structured.
Pointers are mostly used to implement an India-rubber (elastic) memory. The rest of the answer was unclear 🙁
(dash: Default Initialization of simple types and objects!)
ChatGPT’s thoughts about a programming language without null values:
“When a programming language does not allow nullable variables, it means that variables in the language cannot hold a null value. This design choice can have several implications, both positive and negative, on the development process and the language’s usage. Here are some possible implications:
Positive Implications
Increased Type Safety:
- Eliminates Null Reference Errors: Null reference errors (e.g., the infamous “null pointer exception”) are a common source of runtime errors. By disallowing nulls, the language eliminates an entire category of bugs.
- Compiler Enforcement: The compiler can enforce that variables are always initialized to valid values, reducing the risk of uninitialized variables causing runtime issues.
Simplified Code:
- No Null Checks Needed: Developers do not need to write code to check for null values, leading to cleaner and more readable code.
- Clearer Intent: Since variables cannot be null, the intent of the code is clearer. Every variable is guaranteed to have a meaningful value.
Enhanced Reliability:
- Predictable Behavior: Programs are more predictable since variables will always hold valid data, reducing unexpected behaviors due to nulls.
- Improved Debugging: Debugging is simpler because developers do not have to trace null references, making it easier to identify the cause of errors.
Negative Implications
Handling Absence of Values:
- Alternative Representations: Developers need alternative ways to represent the absence of a value. This might involve using special values (e.g., empty strings, zero, or sentinel values), which can be less expressive or lead to misuse.
- Complex Workarounds: The absence of nullable types may require more complex workarounds, such as using optional types, wrappers, or special constructs to handle cases where a value may or may not be present.
Interoperability Challenges:
- Interfacing with Other Systems: Interfacing with systems or languages that allow null values can be challenging. Developers may need to write additional code to handle or convert nulls when interacting with external libraries, APIs, or data sources.
- Data Exchange: Working with databases or data formats that support null values can require additional handling, such as mapping nulls to alternative representations.
Initial Learning Curve:
- Adaptation for Developers: Developers who are used to working with nullable types may need to adapt to the non-nullable paradigm, which could involve a learning curve and changes in programming habits.
Language Examples
Kotlin: Kotlin has non-nullable types by default but provides nullable types explicitly marked with a ?. This strikes a balance by allowing nullable types where necessary while encouraging non-nullable usage.
Rust: Rust uses the Option type to handle cases where a value might be absent, rather than allowing nulls. This enforces explicit handling of the absence of values, improving safety.
Swift: Swift also uses optional types (Optional<T>) to handle nullability, encouraging developers to think explicitly about when a value can be absent.”
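The idea of an explicit optional type can be sketched in Java with `java.util.Optional`. This is only an approximation: Java still permits null everywhere, so unlike Kotlin, Rust, or Swift the compiler does not enforce it. `PhoneBook` and its single entry are hypothetical, chosen only for illustration.

```java
import java.util.Map;
import java.util.Optional;

// Illustrative sketch: PhoneBook and its entry are hypothetical.
final class PhoneBook {
    private final Map<String, String> numbers = Map.of("Alice", "555-0100");

    // The return type states that a number may be absent, so callers
    // have to unwrap the result explicitly before using it.
    Optional<String> lookup(String name) {
        return Optional.ofNullable(numbers.get(name));
    }

    public static void main(String[] args) {
        PhoneBook book = new PhoneBook();
        // Present value: unwrap with a fallback.
        System.out.println(book.lookup("Alice").orElse("unknown"));
        // Absent value: map is skipped and the fallback is used.
        System.out.println(book.lookup("Bob").map(String::length).orElse(0));
    }
}
```

The possible absence of a value is visible in the method signature instead of being an unstated property of every reference.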
Links: