Should the Database and Application projects be in the same Repository?

Following the rise in popularity of DevOps for Databases, many interesting questions are being asked on the topic.

One of these questions is: Should your SQL Database project be in the same source control repository and solution as the App code project? Or maybe they should be in the same repository but separate solutions? Or maybe they should be in completely separate repositories?

Repository? Branch? Folder?

For the purpose of this article, I will be assuming that the source control type we’re dealing with is Git, and not TFS or TFS-equivalent. This is because in TFS there isn’t much difference between placing stuff in two separate repositories, versus simply placing them in different folders/branches within the same repository.

In Git, however, the differences are much more pronounced, so this decision becomes more critical.

It is, however, possible to implement something like TFS folders in Git as well, by creating “disconnected” branches (i.e. more than one “head”/”root” branch) – but this often causes confusion and various technical problems with the branches, and is therefore not recommended.

So anyways, to try and answer our question, let’s try to understand – what do we get from having some things within the same source control repository? In other words: What do we get at the repository level, which cannot be separated between different projects/solutions inside the same repository? And when is it really a requirement to have two different projects in the same solution?

Permissions

In every Git source control platform, the most common “level” at which a person’s access is defined – is at the repository level.

In other words:

  • Which persons have access to the repository?
  • What kind of permissions do they have in the repository? (read / write / create pull requests / approve pull requests / create branches / push to dev / push to main / etc.)

For example, let’s say that we have 3 different “types” of persons:

  1. Mobile App Developers
  2. Data & BI Developers
  3. Team Leaders / Project Managers

And let’s say that we have 2 projects for which each development team is responsible:

  1. MobileApp1
  2. ResearchDB_DWH

Ideally, each team should have write access only to their corresponding project (I’ll let you guess which belongs to which, you hotshot).

Additionally, for the purpose of transparency, cooperation, and Agile™ principles in general, each team should also have read access to the other team’s project.

And finally, only the team leaders and project managers should have permission to approve pull requests and perform merge between branches.

These differences in permissions between these different groups of people make sense that these 2 projects should sit in their own separate 2 repositories.

On the other hand, what if there is only one development team? Or maybe the same development team is responsible for multiple different projects?

In that case, splitting these projects into different repositories wouldn’t really make a difference in terms of permissions…

So, what about other differences?

Branching and Versioning

Normally, a branching policy in Git would look something like this:

a typical GitFlow branching policy
  • The “master”/”main” branch would represent what is currently “live in production”.
  • The “dev”/”develop” branch would represent the next version currently in development at the repository level.
  • The “feature branches” would represent individual developments split from the “develop” branch and then merged back into it when completed.
  • etc.

The critical part here is: when and what do we merge from the “dev” branch into the “main” branch?

In other words: Which is the version that we will be deploying to production and when?

Let’s take for example our 2 projects:

  • MobileApp1
  • ResearchDB_DWH

And let’s say that at a certain point, the Mobile Development team decided that they’re ready to publish their version currently in development to production. i.e. merge “dev” into “main” and deploy it.

But what if the Data & BI development team has their own changes in the same “dev” branch, but their version is not yet ready for production?

This is obviously a problem.

We wouldn’t want a partial version of something to accidentally be deployed to production.

If not everything in the “dev” branch is ready for production, then it should NOT be merged into “main”, or… not all of it should be inside the same “dev” branch in the first place.

The AppDev team wants to publish a release… But the Data&BI team has only partial changes in the same branch!

On the other hand, if both development teams are fully aligned in terms of versioning and readiness for production, then there’s no problem having their projects in the same branches – and thus – in the same repository.

CI/CD Pipelines

When a CI/CD pipeline is configured against a source control, it is configured to listen for changes in a specific repository and a specific branch or set of branches in that repository.

For example, a CI/CD pipeline into the test/staging environment could automatically activate whenever there’s a push or approved pull request into the “dev” branch of a certain repository.

A different pipeline, into the production environment, could automatically activate whenever there’s a push or approved pull request into the “main” branch of the same repository.

It’s also possible to predicate the activation of a pipeline based on tags that it expects to see as part of a commit.

This is an example from Azure DevOps for configuring a continuous deployment trigger

Of course, the pipeline itself could be configured to build and/or deploy a specific solution or project(s) inside the repository (while ignoring others within the same repository). But the TRIGGER for the automated activation of the pipeline has to be – at the very least – at the branch level of a specific repository (and/or specific tags).

The considerations here, therefore, are similar to the considerations about “branching and versioning” as explained in the previous section:

Only if all the projects in the repository are to be deployed together by the same CI/CD pipeline, then they should be in the same repository together. Otherwise, it’s a waste of automated runs.

Inter-Project Coupling and Dependencies

It often makes sense that there would be dependencies between a database project and the application that it’s supposed to be serving.

Oftentimes a change in the app requires that a corresponding change must be done at the database first in order to work (and sometimes even vice versa).

When there is such a strong coupling between an app project and a database project, it would make sense to not only place them within the same repository together but even in the same solution together. As they need to be deployed together and even built together (references within Visual Studio may need to be defined, otherwise the project will not build successfully).

However, this consideration is often rare and minuscule in comparison to the other considerations mentioned previously.

Even if separated projects are dependent on each other, their delivery to production can still be planned ahead and scheduled in such a way so that the component that needs to be changed first, will be deployed first.

Having them in separate repositories is not actually a blocker for such cases, as long as they can be built separately.

Usually, this would not really be an issue. For example, one project could have a dependency on the artifact/assembly/dacpac of another project instead of on the project itself, and so it becomes easier to manage their versioning separately without breaking the build.

What about inter-project dependencies during deployment?

Well, a common methodology for this would be to first deploy one of the projects with backward compatibility, so that it can serve both the old and the new versions of the other project(s) that depend on it.

Alternatively, if this is not possible, then another methodology is to initiate a “system-wide downtime“. Where everything is taken offline until all necessary projects are successfully deployed. You should obviously try to prevent such cases as best you can, by properly planning your changes and deployments in advance.

Conclusion

So, there you have it.

The above are the essential considerations you should be having when asking whether the Database and Application should be in the same repository or not.

As you can probably tell, the common answer is no, they should not be in the same repository. Unless:

  • They have the same permission and access needs
  • They are fully aligned in terms of versioning
  • They must always be deployed together by the CI/CD pipeline(s)
  • They cannot be built separately unless in the same solution

In which case – sure, go ahead. Let them live happily ever after together hand in hand 😊.

Would you like to know more?

Do you need to know more about implementing CI/CD practices for your SQL Database? Would you be interested in high-quality mentoring and tailor-made training for this purpose where we guide you and help you and your team at every step of the way? Then this could definitely be for you:

Do you want to learn more about CI/CD for SQL Server databases in general? I got several free webinars that may interest you:

5 comments

  1. One thing I’d add: even if the database only supports one application today, it’s just a matter of time before additional applications point at the same database.

    At that point, you have multiple apps that are trying to add their own tables inside the same database, and they have some objects in common – and some isolated. If the db is in app A’s repository, but app B needs to make database changes in order to support a new version of their app, then you can be stuck in the awkward position where you *have* to deploy app A without it being ready – just because you need database changes made.

    Liked by 2 people

  2. Ideally, you wouldn’t want to rebuild a C# application just because someone is adding an index to the database it uses. However, if a column is renamed in the database, the application definitely needs to be rebuilt. Within reason, too many builds is better than not enough.

    There can also be dependencies between an OLTP database and the Synapse/ADF pipeline project that needs to do the ELT/ETL from it to a reporting database or other repository. Ideally a change to the OLTP data model wouldn’t break reports based on the repository, but that level of abstraction can be challenging to achieve.

    Like

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.