Repository or Collection in Magento 2: when should you use each?

When handling entities in Magento 2, there are two main approaches available: using a Repository or interacting with a Collection.

These two methods have very distinct uses. It is therefore essential to understand their respective objectives in order to avoid bad practices.


1. When handling a Magento 2 object for the first time

When developing certain features in a Magento 2 module, it is often necessary to interact with the database. To do this, Magento offers various tools for retrieving, modifying, adding, or deleting data. These include:

  • The repository, which provides methods for retrieving an instance of an object.
  • The collection, which allows access to objects according to specific criteria.

The line between these two approaches can sometimes seem blurred, as both can lead to similar results. However, the choice between them should be driven by what the retrieved data will be used for.


2. The Repository: for single-entity access

The repository is a Magento 2 standard that provides several methods for retrieving, saving, or deleting an entity. For example, for a product, the following methods are available:

  • get($sku, $editMode = false, $storeId = null, $forceReload = false): retrieves a product by its SKU.
  • getById($productId, $editMode = false, $storeId = null, $forceReload = false): retrieves a product by its ID.
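
The two calls above could be wrapped as follows. This is a minimal sketch, assuming the repository is injected through the constructor; the namespace Vendor\Module and the class name ProductReader are hypothetical, while ProductRepositoryInterface and NoSuchEntityException are Magento core types.

```php
<?php
declare(strict_types=1);

namespace Vendor\Module\Model; // hypothetical namespace

use Magento\Catalog\Api\Data\ProductInterface;
use Magento\Catalog\Api\ProductRepositoryInterface;
use Magento\Framework\Exception\NoSuchEntityException;

class ProductReader // hypothetical class name
{
    private ProductRepositoryInterface $productRepository;

    public function __construct(ProductRepositoryInterface $productRepository)
    {
        $this->productRepository = $productRepository;
    }

    /**
     * Returns the product with the given SKU, or null when none exists.
     */
    public function findBySku(string $sku): ?ProductInterface
    {
        try {
            return $this->productRepository->get($sku);
        } catch (NoSuchEntityException $e) {
            return null;
        }
    }

    /**
     * Returns the product with the given entity ID, or null when none exists.
     */
    public function findById(int $productId): ?ProductInterface
    {
        try {
            return $this->productRepository->getById($productId);
        } catch (NoSuchEntityException $e) {
            return null;
        }
    }
}
```

Catching the exception and returning null is a design choice made for this sketch; letting NoSuchEntityException propagate is just as valid when a missing product is an error for the caller.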

It also offers the possibility of retrieving a list of entities via the getList() method, which takes an object of type SearchCriteriaInterface as a parameter. This allows filters, sorting and pagination to be applied.
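
As an illustration, a paginated and sorted getList() call might look like the sketch below. The namespace and the class name EnabledProductList are hypothetical; SearchCriteriaBuilder and SortOrderBuilder are the framework builders normally used to assemble a SearchCriteriaInterface object.

```php
<?php
declare(strict_types=1);

namespace Vendor\Module\Model; // hypothetical namespace

use Magento\Catalog\Api\Data\ProductInterface;
use Magento\Catalog\Api\ProductRepositoryInterface;
use Magento\Catalog\Model\Product\Attribute\Source\Status;
use Magento\Framework\Api\SearchCriteriaBuilder;
use Magento\Framework\Api\SortOrder;
use Magento\Framework\Api\SortOrderBuilder;

class EnabledProductList // hypothetical class name
{
    private ProductRepositoryInterface $productRepository;
    private SearchCriteriaBuilder $searchCriteriaBuilder;
    private SortOrderBuilder $sortOrderBuilder;

    public function __construct(
        ProductRepositoryInterface $productRepository,
        SearchCriteriaBuilder $searchCriteriaBuilder,
        SortOrderBuilder $sortOrderBuilder
    ) {
        $this->productRepository = $productRepository;
        $this->searchCriteriaBuilder = $searchCriteriaBuilder;
        $this->sortOrderBuilder = $sortOrderBuilder;
    }

    /**
     * Returns one page of enabled products, sorted by SKU.
     *
     * @return ProductInterface[]
     */
    public function getPage(int $page, int $pageSize = 20): array
    {
        // Sorting: ascending by SKU.
        $sortOrder = $this->sortOrderBuilder
            ->setField('sku')
            ->setDirection(SortOrder::SORT_ASC)
            ->create();

        // Filter, sort and pagination are all carried by the search criteria.
        $searchCriteria = $this->searchCriteriaBuilder
            ->addFilter('status', Status::STATUS_ENABLED)
            ->setSortOrders([$sortOrder])
            ->setPageSize($pageSize)
            ->setCurrentPage($page)
            ->create();

        return $this->productRepository->getList($searchCriteria)->getItems();
    }
}
```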

Although very practical, repositories sometimes suffer from a lack of consistency in their structure. They can be found either in the Model folder (which would be the most logical place) or in ResourceModel.

In all cases, a repository relies on a ResourceModel to perform database operations. It is a cleaner abstraction over the old $model->load($id) method, which is now deprecated.

Repositories are generally coupled with an interface, which makes it easy to expose them via Magento APIs.

These methods return a complete object, which contains not only the main data for the entity (e.g. the product), but also data from associated tables (such as categories or custom attributes).

This results in objects that are rich but potentially cumbersome, especially if only a small amount of data is required.


3. The Collection: for mass, optimised access

The collection, meanwhile, allows for more direct interaction with the database. It enables records to be retrieved according to specific criteria, in the same way as SQL queries.

For example:

$productCollection->addAttributeToFilter($attribute, $condition, $joinType);

This allows you to retrieve multiple records, or just one, depending on your needs (for example, via the SKU or ID).
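
A minimal sketch of such a filtered collection is shown below, assuming the generated CollectionFactory is injected through the constructor. The namespace and the class name ProductFinder are hypothetical, and the selected attributes (name, price) are only an example.

```php
<?php
declare(strict_types=1);

namespace Vendor\Module\Model; // hypothetical namespace

use Magento\Catalog\Model\ResourceModel\Product\Collection;
use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory;

class ProductFinder // hypothetical class name
{
    private CollectionFactory $collectionFactory;

    public function __construct(CollectionFactory $collectionFactory)
    {
        $this->collectionFactory = $collectionFactory;
    }

    /**
     * Builds a collection restricted to the given SKUs.
     *
     * @param string[] $skus
     */
    public function getBySkus(array $skus): Collection
    {
        // Always create a fresh collection: filters accumulate on an instance.
        $collection = $this->collectionFactory->create();

        $collection
            ->addAttributeToSelect(['name', 'price']) // example attributes only
            ->addAttributeToFilter('sku', ['in' => $skus]);

        // The query runs lazily, when the collection is first iterated or loaded.
        return $collection;
    }
}
```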

This approach is lighter because the collection loads only the data you ask for, for example the attributes selected with addAttributeToSelect(). You can also perform joins, as in the following example:

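// Join the category/product link table and expose category_id
// under the alias product_category_ids, then filter on that alias.
$productCollection->joinTable(
    ['ccp' => 'catalog_category_product'],
    'product_id = entity_id',
    ['product_category_ids' => 'category_id']
)->addAttributeToFilter('product_category_ids', ['in' => $categories]);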


4. Why not always use the repository, even for a single object?

In the case of complex objects such as products, using a repository can be counterproductive if only a few fields are needed (e.g., price). The repository loads the complete entity, including attributes and associated data that will never be read, which generates many unnecessary queries.

In this case, using a collection is preferable: it limits queries and allows for much better optimisation based on the data that is actually needed.
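
For instance, reading only the price of one product could be sketched as follows. The namespace, the class name PriceLookup and the method name getPriceBySku are hypothetical; the collection calls are those of the core product collection.

```php
<?php
declare(strict_types=1);

namespace Vendor\Module\Model; // hypothetical namespace

use Magento\Catalog\Model\ResourceModel\Product\CollectionFactory;

class PriceLookup // hypothetical class name
{
    private CollectionFactory $collectionFactory;

    public function __construct(CollectionFactory $collectionFactory)
    {
        $this->collectionFactory = $collectionFactory;
    }

    /**
     * Returns the base price stored for the given SKU,
     * or null when the product or its price is missing.
     */
    public function getPriceBySku(string $sku): ?float
    {
        $collection = $this->collectionFactory->create();
        $collection
            ->addAttributeToSelect('price')     // load only the price attribute
            ->addAttributeToFilter('sku', $sku) // restrict to a single SKU
            ->setPageSize(1);

        // getFirstItem() returns an empty object when nothing matches.
        $product = $collection->getFirstItem();
        if (!$product->getId()) {
            return null;
        }

        $price = $product->getData('price');

        return $price !== null ? (float) $price : null;
    }
}
```

This reads the raw price attribute only; special prices, tier prices and catalog rules are deliberately out of scope for the sketch.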

However, the repository remains indispensable in certain situations, particularly when a large amount of information needs to be retrieved about an entity.


In summary

In general, it is recommended to favour collections for their flexibility and performance. Repositories should be reserved for cases that genuinely call for them, such as when the complete entity is required.