I want to know the main reasons for making variables and functions in a class private. How is it better? What can happen if you don't do it?
Here are a few possible reasons that I can think of that someone could have:
* You have been taught to do it so you just do it without thinking.
* It reduces the number of files you have to search in if you want to find all uses of a member.
* The member is hard to understand so you want to discourage people from using it.
To clarify, I'm only talking about code in the same project that everyone has access to. I'm not talking about defining an API for other people to use that don't have access to the code, like when you make a library.
Then one day you decide you want to aggressively refactor that class. Now you can because the interface can stay the same even though you could completely re-do the guts. Better still: all your tests will still work, as will your class documentation.
Ouch, that's a pretty careless reason to start with. Why does insulting your audience seem like the right first move?
In much of the classist (classy? You decide) code I've written, the entire reason that I've chosen to encapsulate the data into a class is that maintaining a consistent data structure requires bookkeeping. The fields and methods associated with this bookkeeping are not part of the public interface, because there should never be a reason for consumers to handle that stuff directly.
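To make the bookkeeping point concrete, here is a minimal sketch in Python. The `TagIndex` class and its methods are hypothetical, made up for illustration, and Python only marks members as internal by underscore convention rather than enforcing it. The two dictionaries must always agree with each other, which is exactly the kind of state consumers should not handle directly.

```python
class TagIndex:
    """Maps documents to tags and keeps a reverse index in sync (hypothetical example)."""

    def __init__(self):
        # Bookkeeping: these two structures must always agree with each other.
        self._tags_by_doc = {}
        self._docs_by_tag = {}

    def add(self, doc, tag):
        # Both structures are updated together, so they cannot drift apart.
        self._tags_by_doc.setdefault(doc, set()).add(tag)
        self._docs_by_tag.setdefault(tag, set()).add(doc)

    def remove_doc(self, doc):
        # Remove the document from every tag it appeared under.
        for tag in self._tags_by_doc.pop(doc, set()):
            docs = self._docs_by_tag[tag]
            docs.discard(doc)
            if not docs:
                del self._docs_by_tag[tag]

    def docs_with(self, tag):
        # Return a copy so callers cannot mutate the internal sets.
        return set(self._docs_by_tag.get(tag, ()))
```

If callers could write to `_docs_by_tag` directly, nothing would keep `_tags_by_doc` consistent with it, and `remove_doc` would quietly start doing the wrong thing.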
In short: even in a project that "everybody has access to," you should still implement clean APIs. If you let everybody put hooks into everything anywhere, you get horrible hacky spaghetti code.
Edit: and just FYI, "private" does not mean "secure" or "secret" or "untouchable." A dedicated user can gain access to, and modify, everything.
Within reason, you should treat every class as a tiny library that defines an API for other people to use.
That allows you freedom to change things within the class without breaking other code. It also makes it easier to prevent accidental misuse of the class that can leave it in weird states that aren't supposed to happen. It allows you to reason about the intended use cases of the class and write tests for them all.
Let's say I have a Motorcycle class with 2 Wheel members. What if in the code I assume 2 Wheels when writing the serializer? And then someone comes along, instantiates a Motorcycle and nukes a Wheel? The code will break.
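A rough sketch of that Motorcycle situation in Python, with made-up names (`Wheel`, `Motorcycle`, `replace_front`, `serialize`) and the caveat that the underscore is only a convention in Python. The point is that the serializer's "exactly two wheels" assumption holds as long as nobody reaches past the public methods.

```python
class Wheel:
    def __init__(self, diameter_inches):
        self.diameter_inches = diameter_inches


class Motorcycle:
    def __init__(self, front, rear):
        # Stored as a tuple so the wheel count stays fixed at two.
        self._wheels = (front, rear)

    @property
    def wheels(self):
        # Read-only view; there is no public way to remove a wheel.
        return self._wheels

    def replace_front(self, wheel):
        # The only supported change keeps the "two wheels" invariant intact.
        self._wheels = (wheel, self._wheels[1])


def serialize(motorcycle):
    # Relies on there being exactly two wheels.
    front, rear = motorcycle.wheels
    return {"front": front.diameter_inches, "rear": rear.diameter_inches}
```

With a public, mutable list of wheels instead, any caller could delete one and `serialize` would fail at the unpacking line, far away from the code that caused the problem.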
You can make an argument that OOP is not what you want, and instead create something akin to namespaces+data, and that's fine. But OOP's thing is to create a public API, and controlled state behind it.
it does not matter what you promise in the contract:
all observable behaviors of your system
will be depended on by somebody."
I might want to have a method called SetPrice(APIPrice price) and there I do a check to see if it's a valid value as far as my code is concerned. I had a situation like this handling pricing information taken from a variety of indices, where I wanted a calculation to still work out to something usable. Also, most business users don't understand what NULL * 3.5 would mean, but they would understand 0.0 * 3.5.
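Something along these lines, sketched in Python rather than the original language. The `Quote` class, the `scaled` method, and the specific rule (missing, NaN, or negative prices become 0.0) are my assumptions for illustration; the real validity check would depend on the business rules.

```python
import math


class Quote:
    def __init__(self):
        self._price = 0.0

    def set_price(self, price):
        # Assumed rule for this sketch: anything unusable is stored as 0.0,
        # so later calculations still produce a number.
        usable = (
            isinstance(price, (int, float))
            and not math.isnan(price)
            and price >= 0
        )
        self._price = float(price) if usable else 0.0

    def scaled(self, factor):
        # Safe because _price is always a valid float.
        return self._price * factor
```

Because `_price` is only ever written through `set_price`, `scaled` never has to deal with a missing value.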
That's the main one off the top of my head.
- Steering wheel
- Transmission
- Brakes and Gas pedal
- Doors and Windows, etc.
The following is a private API for a Car:
- The inner workings of the engine
- The inner workings of the battery
- Smart brake system, internal wiring, etc.
All users of the Car only care about how to drive it with as little knowledge as possible.
Now Mazda can do a full recall and upgrade the smart braking systems internally but millions of drivers around the world don't have to "learn" anything new, they can happily continue using the public API.
It makes refactoring much easier, reduces dependency and enables duck typing (in supported languages) via message passing for objects with a common public API. You also don't have to perform shotgun surgery[0] when changing classes.
If you can see the use in access modifiers for libraries then it's not so hard to make the leap to its general usefulness. Defining objects with clear and simple APIs is extremely valuable when working on complex software projects. As the code base inevitably grows to a point where no one knows every part of it, you want developers to be able to contribute without having intimate knowledge of everything. The value proposition is the same as for third-party libraries.
Coming from Fred Brooks, this was an enormous admission! What is this amazing "information hiding?" It is separating interface from implementation, leaving one public and the other private. It is accomplished, among other means, by marking things "private."
This is the one thing in 20 years that Brooks found to be valuable in increasing programmer productivity in terms of essential complexity. Marking implementation details as private reduces the essential complexity of the code.
By the way, in some of your comments you talk about "everyone" having access to things, as if that makes a difference. Marking things as "private" is not about closing off access to particular people. It's not a security measure. It's about reducing the complexity of a given class by ensuring that the number of access points to that class is limited. That could affect other teams, sure, but it could also affect future-you, or you-working-on-other-class.
The only time I break this rule is for data-transport type classes. For example if I serialize some object into some class X before sending it to some API. Since all those objects do is hold data, I see no point in making the members private. Many people do, I suppose out of habit; or in anticipation that maybe one day those objects will do more than hold data. I'd like to hear a good argument for making members private in data-transport type classes.
A lot of people are ignoring this part of the question. The primary usefulness of object-oriented programming is scope control. Dependencies kill teams, and scope control helps them survive. The more I limit the scope of something, the easier for someone else to understand it and make necessary changes.
Imagine a language without scope control: When people try to understand how something works and what might break it, they have to consider all possible points of entry, which becomes every variable & function in your class.
Java programmers often pitch fits about "getters" and "setters", insisting that every variable be private and guarded by a public method. In your context, you might find this to be kind of silly since we're mostly just moving scope control around instead of limiting it, but good luck fighting that battle (I'm not going to try). At any rate, I still would not use this kind of annoyance as an excuse to wave off scope control as nothing but bureaucratic boilerplate. It has real value when used pragmatically.
The set of property values of an object is its state, and you can model every possible valid state and state transition. Not every state is valid. If you allow all variables to be touched willy-nilly, you allow an object to be put into an invalid state.
The only variables and functions that should be public are those that cannot put an object into an invalid state.
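As a small illustration of modelling valid states and transitions, here is a sketch in Python. The `Order` class, its states, and the transition table are invented for the example; Python's underscore prefix is a convention, though the read-only property does reject plain assignment to `state`.

```python
class Order:
    # Assumed transition table for this sketch: state -> states reachable from it.
    _ALLOWED = {
        "new": {"paid", "cancelled"},
        "paid": {"shipped", "cancelled"},
        "shipped": set(),
        "cancelled": set(),
    }

    def __init__(self):
        self._state = "new"

    @property
    def state(self):
        # Readable from outside, but not assignable.
        return self._state

    def _transition(self, target):
        # Single choke point where every state change is validated.
        if target not in self._ALLOWED[self._state]:
            raise ValueError(f"cannot go from {self._state} to {target}")
        self._state = target

    def pay(self):
        self._transition("paid")

    def ship(self):
        self._transition("shipped")

    def cancel(self):
        self._transition("cancelled")
```

The public methods can only move the object along allowed edges, so an unpaid order can never end up "shipped" by accident.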
The practical reason is that this way you have a limited set of paths (methods) through which the inner workings of a class can be modified, whereas if everything were public, any other object could change a class's state, making debugging far more complicated.
This is also a part of the Open-Closed principle: "software entities (classes, modules, functions, etc.) should be open for extension, but closed for modification" https://en.wikipedia.org/wiki/Open%E2%80%93closed_principle
Even within a single module of code, your public interfaces are "APIs". If I write a class that displays a paged interface in a popup modal, there is no relevant difference between a member of my team utilizing that class (from within the same module) and other people using it (as a library I have published). They both read the documenting comments and access the public members to achieve their goals, and both will suffer from the same confusion if there is no way to tell which members are the interface and which members are implementation details.
Anything I do inside that class that isn't marked public is an implementation detail, and might change. Given time, likely will change! But if it's not marked public, then if you want to access something in there, you'll need to think through the implications of making it public, either as a new method or by adding to an existing method.
If I come along later and want to change things about those methods, I can do that freely, so long as I maintain the same input types and return types.
Internally I can change a counter from an Int to a Long, and it doesn't matter to anybody. Internally I can change the way I'm assembling a String, and that's fine. Internally I can move the definition for an interim variable to a separate function with its own tests, and that's fine. So long as I maintain the input types and return types, all is good.
If the fields and methods were public, on the other hand, then other classes might be accessing them directly, and now any change has to be negotiated with all callers. The counter needs to be changed from an Int to a Long? If it's public, you'll need to make sure that no other classes anywhere are referring to that counter and expecting it to be an Int.
Side effects are generally unwanted, and marking things as public is making everything in an implementation a side effect.
If the field is mutable then obviously you may want to keep it private to ensure some invariant, or to cache/invalidate/log something when accessed. Accessing the private data directly will circumvent that.
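A quick sketch of the cache-invalidation case in Python, using a hypothetical `Polygon` class (not from any library). The cached area is only correct if every mutation goes through `add_point`; writing to `_points` directly would skip the invalidation.

```python
class Polygon:
    def __init__(self, points):
        self._points = list(points)
        self._area = None  # cached result, None means "needs recomputing"

    def add_point(self, point):
        self._points.append(point)
        self._area = None  # invalidate the cache on every mutation

    def area(self):
        if self._area is None:
            # Shoelace formula over consecutive vertex pairs.
            pts = self._points
            total = 0.0
            for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
                total += x1 * y2 - x2 * y1
            self._area = abs(total) / 2
        return self._area
```

Appending to `_points` from outside the class leaves `_area` stale, which is the "missed cache" kind of bug that is tedious to track down.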
Another case is when the exposed data isn’t the same as the stored data (e.g. a DateTime struct might have a field with a 32-bit Unix timestamp but only expose year/month/day/hour…). The reason for not exposing it is that you don’t want consumers to depend on it. If you changed to a 64-bit internal representation you would break any consumer who is using the private field. But if you only expose year/month/day/hour/… then you can solve your year-2038 problem by simply changing the data type. Anything exposed is depended on. Always. And here is the kicker: it doesn’t matter if your consumers are other classes in the same program, or library users for a shipped API; they are users and they will (mis)use your class in any way possible. In the library API case you break others’ code. In the internal code case you might get a missed cache, or a harder refactor or similar.
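Here is roughly what that looks like, sketched in Python with a hypothetical `Timestamp` class. Python integers have no 32-bit limit, so this only illustrates the shape of the idea: the stored representation is internal, and callers see only calendar fields.

```python
from datetime import datetime, timedelta, timezone

_EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


class Timestamp:
    def __init__(self, seconds_since_epoch):
        # Internal representation; free to change without touching callers.
        self._seconds = int(seconds_since_epoch)

    def _as_utc(self):
        return _EPOCH + timedelta(seconds=self._seconds)

    @property
    def year(self):
        return self._as_utc().year

    @property
    def month(self):
        return self._as_utc().month

    @property
    def day(self):
        return self._as_utc().day

    @property
    def hour(self):
        return self._as_utc().hour
```

Swapping `_seconds` for, say, milliseconds or a stored `datetime` would only require changing `__init__` and `_as_utc`, as long as nobody outside the class reads `_seconds`.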
It's for the same reasons. When you write code you are always writing code that some other human will consume through an interface of sorts. Even on small three person projects two truths hold:
- Your project mates are other humans.
- Your future self is another human.
By defining members as private you're announcing in code to other humans that the private members are important only to the internal implementation of the class and should not be used elsewhere. And yes, your future self will thank you for the distinction.
Now, you don't need to have private methods. In some languages, a workaround is to label the methods as "private" by adding a prefix of some sort like an underscore.
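Python is one such language. A short sketch of the convention, with a hypothetical `Counter` class: a single leading underscore is purely a signal to readers, and a double leading underscore additionally triggers name mangling, which makes accidental access harder but does not prevent deliberate access.

```python
class Counter:
    def __init__(self):
        self._count = 0   # single underscore: internal by convention only
        self.__step = 1   # double underscore: stored as _Counter__step

    def increment(self):
        # Inside the class, the mangled name is resolved automatically.
        self._count += self.__step
        return self._count


c = Counter()
c.increment()
# Nothing stops a determined caller; the names just say "don't".
internal_count = c._count
mangled_step = c._Counter__step
```

Neither form is a security boundary. Both mainly document which members are part of the interface and which are not.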
Another more functional reason has to do with inheritance. Private methods have different behaviors in different languages when inherited.
For personal projects I do use public things where necessary, and for team programming I also use public functions, but not public variables (sonar doesn't allow it) unless you make them final, which I do.
The important part is "when necessary". Think that each class is like a small little library (it should be at least). There are things you want others to know and use, and other internal things you don't. Public for the first, private for the second.
You sound like a junior developer looking for guidance.
There will always be some amount of coupling, but reducing it makes it much easier to reason in your mind about small sections of code. With good encapsulation, there are clear interfaces between components and it is difficult to use them incorrectly.
You could argue that if there are only a few or even one developer on a project, there is no harm in making everything public since everyone will just use the class "correctly". However, the "correct" way to use the class is not codified or enforced anywhere other than the minds of the developers, comments, or external documentation. All of these sources can easily fall out of sync with the actual code being written. The compiler/runtime should be used to enforce correct usage when possible. This greatly reduces the cognitive load on the developer/s since they know if the code compiles or runs without error, a whole class of bugs has already been eliminated.
I would argue that even if access controls were completely ineffective (they didn’t actually control access), they would still be useful as API documentation within the source code to point others and your future self to the set of variables and functions that should be used to interact with the class. There is a benefit to writing code that is itself expressive of intent without requiring additional documentation or knowledge.
Another reason to reduce coupling is to make it easier to re-implement a single component of the system. If the component was well encapsulated and has a clean and minimal public interface, the only restriction on the new implementation is to meet that same public interface. If the component was not well encapsulated, the new implementation may have to maintain a number of details from the previous implementation that no longer make sense just to maintain all of the unnecessary coupling that’s been created between the component and the code that uses it.
This applies to classes without external users, too: the same division between class responsibilities and implementation remains valid.
A low surface area API means clean, elegant code that is simpler to use and understand.
>To clarify, I'm only talking about code in the same project that everyone has access to. I'm not talking about defining an API for other people to use that don't have access to the code, like when you make a library.
The concept of API still applies to same codebase modules. Ditching it is the shortest path to spaghetti code.
I do that so that "future me" has as little to understand as possible. ("reduce the number of files to search" seems silly in a world with filename wildcards and find. Are "modern" tools less capable?)
You can change private things and nobody might ever know.
It's about the need to coordinate and communicate changes and avoid people relying on unspecified behavior.
It says "don't call me". That might be because it might be refactored without notice, or might break the intended use of the Class.
Even in languages without formal class access control, there is convention for labelling something as externally unsafe. It's an important part of sharing code.
I don't care and I shouldn't care about the dependencies or statefulness. I'm interested in the before-and-after of using the class methods.
It lowers cognitive load because it provides demarcation of inside vs outside.
I used to believe in information hiding, but not anymore.
Then, you will probably divide up the code into a number of separate methods for better structured code. All the methods that are purely internal and not part of the public API should be private, if only to enforce the public API. This is also the encapsulation principle.
Should the implementation of a module not be able to utilize functions internally?