Interface between Domain layer and Persistence layer
What’s in a name?
I happened across Debasish Ghosh’s post “Inject Repositories, not DAOs in Domain Entities” and it triggered a few thoughts. In this post, he suggests that the Data Access Objects that perform database reads and writes should be wrapped in a Repository to isolate the domain layer from the details of the persistence layer. What do I think? Well, yes and no. Let’s look at this in a little more detail.
In Debasish’s example, the DAO interface looks like
public interface EmployeeDao {
List getAllEmployees();
Employee getEmployeeById(long id);
List getEmployeesByAddress(Address address);
List getEmployeesByName(String name);
// .. other methods
}
and the Repository interface is
public interface EmployeeRepository {
List getOutstationEmployees(Address address);
// .. other business contracts
}
My first reaction is, “What? The domain layer (the class, Organization, in this example) doesn’t need to access employees by name, address, id, etc.?” But maybe these access methods are included in EmployeeRepository and delegate to EmployeeDao. OK, that’s fine.
In my own work, I tend to have a DAO object (though I really like Debasish’s use of Repository for the interface name) that delegates to Reader and Writer objects that interface with the database. I’ve used this pattern so much that, after writing very similar code for a couple of clients, I reimplemented the Reader/Writer framework on my own time and released it under a BSD-like license so I could use it for subsequent clients without reimplementing yet again.
If Debasish’s DAO methods correspond to my Reader and Writer objects, then these approaches are very similar. Debasish is just trading off a wider interface for my set of little single-purpose objects at the lowest level. That, and we’ve got some slight differences in naming convention.
Then I saw the implementation of getOutstationEmployees():
public List getOutstationEmployees() {
List emps = employeeDao.getEmployeesByAddress(corporateOffice);
List allEmps = employeeDao.getAllEmployees();
return CollectionUtils.minus(allEmps, emps);
}
Ugh! I’m all for delaying optimization until it’s needed, but that doesn’t mean I’ll choose a bad algorithm until it becomes an obvious bottleneck. This code has two easily avoided problems:
- it makes two round-trips to the database to get one set of entities, and
- it retrieves more data than it needs and then throws away the entities it doesn’t want.
This can create performance and scalability issues in a hurry. It makes me wonder why people jump to this sort of implementation.
Is it because many programmers don’t feel comfortable writing SQL queries, so they stick with the language they know? In many cases, this probably hits the nail on the head. My advice to those programmers is to stretch their skills. Pick up another tool, and you won’t be reaching for that hammer when you’re really trying to drive a screw.
Is it because adding another method to the EmployeeDao, e.g., List getEmployeesAvoidingAddress(Address address), seems like a never-ending widening of the persistence interface to support changing business needs? If so, then maybe that’s a reason to favor the single-public-method Reader class over the method-in-a-larger-DAO-class reader.
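To make the single-public-method Reader idea concrete, here is a minimal sketch in plain JDBC. It is not the Reader/Writer framework mentioned above, whose API isn't shown in this post. The table and column names (employee, name, city), the use of a city string in place of an Address, and the list-of-names result are all assumptions for illustration.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical single-purpose reader: one class, one query, one public method. */
class NotAddressEmployeeReader {
    // Assumed schema: employee(name, city). Note that "city <> ?" also
    // leaves out rows whose city is NULL.
    static final String SQL = "SELECT name FROM employee WHERE city <> ?";

    private final Connection connection;
    private final String excludedCity;

    NotAddressEmployeeReader(Connection connection, String excludedCity) {
        this.connection = connection;
        this.excludedCity = excludedCity;
    }

    /** One round-trip; the database does the filtering. */
    List<String> find() throws SQLException {
        List<String> names = new ArrayList<>();
        try (PreparedStatement stmt = connection.prepareStatement(SQL)) {
            stmt.setString(1, excludedCity);
            try (ResultSet rs = stmt.executeQuery()) {
                while (rs.next()) {
                    names.add(rs.getString("name"));
                }
            }
        }
        return names;
    }
}
```

Supporting a new business need then means adding another small class like this one rather than widening an existing interface.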
For the record, my implementation would look something like this:
public List getOutstationEmployees() {
return employeeRepository.getEmployeesAvoidingAddress(corporateOffice);
}
public interface EmployeeRepository {
Employee getEmployeeById(long id);
List getAllEmployees();
List getEmployeesByAddress(Address address);
List getEmployeesAvoidingAddress(Address address);
List getEmployeesByName(String name);
// .. other methods required by domain layer
}
public class EmployeeDao implements EmployeeRepository {
public List getAllEmployees() {
MultiEmployeeReader reader = new AllEmployeeReader();
return reader.find();
}
public List getEmployeesAvoidingAddress(Address address) {
MultiEmployeeReader reader = new NotAddressEmployeeReader(address);
return reader.find();
}
// .. other methods
}
Here MultiEmployeeReader is an abstract class extending JdbcReader (see my JDBC framework, mentioned above), and each concrete Reader class extends it and returns a list of Employee. For that matter, I would name the Repository methods findXyz() rather than getXyz(), but that’s a stylistic issue. If the needs grew much further, I would probably introduce an EmployeeCriteria class for the selection.
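An EmployeeCriteria could take many shapes. One hypothetical sketch, assuming an invented employee table with id, name and city columns, collects conditions and their bind parameters so that a single Reader could serve many selections. The method names here are illustrative only.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical criteria object: accumulates WHERE conditions and bind values. */
final class EmployeeCriteria {
    // Assumed table and columns; not taken from the original post.
    private static final String BASE = "SELECT id, name, city FROM employee";

    private final List<String> conditions = new ArrayList<>();
    private final List<Object> parameters = new ArrayList<>();

    EmployeeCriteria nameEquals(String name) {
        return add("name = ?", name);
    }

    EmployeeCriteria cityEquals(String city) {
        return add("city = ?", city);
    }

    EmployeeCriteria cityNot(String city) {
        return add("city <> ?", city);
    }

    private EmployeeCriteria add(String condition, Object value) {
        conditions.add(condition);
        parameters.add(value);
        return this;
    }

    /** SQL with one placeholder per condition, joined with AND. */
    String toSql() {
        if (conditions.isEmpty()) {
            return BASE;
        }
        return BASE + " WHERE " + String.join(" AND ", conditions);
    }

    /** Bind values in the same order as the placeholders in toSql(). */
    List<Object> parameters() {
        return Collections.unmodifiableList(parameters);
    }
}
```

A Reader would pass toSql() to Connection.prepareStatement and bind parameters() in order, so the values never get concatenated into the SQL text.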
Edit 3/29/2021: updated links for JDBC framework to GitHub, where the code now resides
Hi,
>> This can create performance and scalability issues in a hurry. It makes me wonder why people jump to this sort of implementation.
public List getOutstationEmployees() {
List emps = employeeDao.getEmployeesByAddress(corporateOffice);
List allEmps = employeeDao.getAllEmployees();
return CollectionUtils.minus(allEmps, emps);
}
Of course, I could use one highly optimized SQL query to find OutstationEmployees. What happens if the logic behind OutstationEmployees becomes more complex?
IMHO, the data tier doesn’t scale well. It might be better to move some logic to the application server. By caching the business logic methods using AOP, I don’t have to go to the data tier every time I need the list of AllEmployees, OutstationEmployees, …
Thanks for the note, Lek.
If the logic behind OutstationEmployees becomes more complex, the code will have to change anyway. It doesn’t matter a lot if the code that changes is in the native language of the app or the SQL query it sends to the database. (I also don’t consider adding a WHERE clause to be “highly optimized SQL.”)
Caching can certainly help in some situations. With regard to collections of things, I would reserve caching for things that are needed frequently relative to the rate of change. Otherwise you’ll be invalidating your cache on a frequent basis.
If you can hold the entire contents of your database in memory, then maybe a relational database is the wrong sort of persistence for your application. I think for most business applications, at least the ones I’ve developed, holding in memory the data on all employees would be a bigger limitation than writing a SQL query.
In what ways does the “data tier not scale well”? Certainly it can handle larger amounts of data in mass storage than will fit in local memory. What issues do you have in mind?
I don’t know if you are using mock objects or not but anytime I am, I want to make my Interfaces as thin as possible. To me Interfaces are like mini frameworks APIs. The DAO component is responsible for giving you data from the persistence layer. Am I wrong?
Joe,
Yes, the DAO gives you data from the persistence layer. I’m not clear on what your point is.
Hi George,
I think Debasish’s article is a simple example of what repositories are and how they differ from DAOs. Performance is a separate concern. What needs further discussion is how to apply repositories together with DAOs, in the terms of Debasish’s article. Or, if we want to talk about performance, it would be good to discuss how to implement repositories whose every method call needs to analyze a table with millions of rows.
The JDBC library may be found on GitHub:
https://github.com/gdinwiddie/JdbcLib