My First Data Warehouse: Intro

DEAR READER: I want to make you aware this blog series is intended to share my journey mistakes and all, to creating my first data warehouse.

I have been wanting to create a data warehouse at my company for several years. I have made progress here and there. Some table design, some SSIS packages but I’ve never been able to get a data warehouse setup in Production. I had failed to complete the most important step; I didn’t have an internal advocate. This time was different! We had hired someone new at the company that was used to performing data analytics and creating visualization.

We needed to analyze both data that did and did not exist in our systems. When I first started this project, the goal was to get data imported in the shortest amount time possible. Due to the time constraint, I was asked to import the data without any transformations. For the readers that have been on a journey like this before, I think you’ll know where this story is headed.

Even though the data was in the database, it wasn’t easy to analyze. It was at this point that I realized I made a couple of beginner mistakes. There were some key design elements that needed to be completely redone. Over the next couple of weeks, we’ll go over what I learned and how I fixed things.

Multiple Database Projects in Azure Data Studio

In many companies, you’re often working with more than one database on the same server. When I use Visual Studio, I have all the databases that exist on the same server in the same solution. You can do this same thing when you’re working with Azure Data Studio.

In this week’s YouTube series, I show you one trick that will let you have multiple database projects in the same workspace. This becomes especially important if you have database code that references objects in a separate database.

Creating a Publish Profile in Visual Studio

One of the factors that allowed my company to get comfortable automating database deployments was SQL Server Data Tools (SSDT) and publish profiles. We started using SSDT with database projects before Azure Data Studio (ADS) existed.

One of our fears was always how to prevent losing data and critical data code. Here were publish profiles to our rescue. We also found that some of our database code had specific values depending on the environment or contained references to other databases. Once again, publish could solve these problems!

While I’d love to say that you could use ADS to manage your database projects, that just isn’t true right now. However, we have a way to help you get a publish profile created. If you don’t want to use Visual Studio yourself, you might want to ask your Developer friends real nice and see if they’d be willing to help you out.

I’ve create a nice YouTube video to show you how you can create your own publish profile. Check it out!

And if you’re curious about what a publish profile might look like, here’s an example of the one created in the video.

<?xml version="1.0" encoding="utf-8"?>
<Project ToolsVersion="Current" xmlns="http://schemas.microsoft.com/developer/msbuild/2003">
  <PropertyGroup>
    <IncludeCompositeObjects>True</IncludeCompositeObjects>
    <TargetDatabaseName>BookCatalog</TargetDatabaseName>
    <DeployScriptFileName>BookCatalog.sql</DeployScriptFileName>
    <TargetConnectionString>Data Source=localhost;Integrated Security=True;Persist Security Info=False;Pooling=False;MultipleActiveResultSets=False;Connect Timeout=60;Encrypt=False;TrustServerCertificate=False</TargetConnectionString>
    <BlockOnPossibleDataLoss>True</BlockOnPossibleDataLoss>
    <ProfileVersionNumber>1</ProfileVersionNumber>
  </PropertyGroup>
</Project>

But I’d highly recommend using the GUI in Visual Studio instead of trying to make a file on your own.

Update the Default Branch Name in Azure Data Studio

While performing my videos, I realized that I hadn’t updated my default branch name from the previous value. This week’s video walks you through the process of renaming your default branch from master to main. This video not only covers how to update GitHub but also how to update your local GitHub project. Once again, we’re using Azure Data Studio to work with our Database Projects. You can also use Visual Studio, if you prefer.

Deploy Database Projects Manually

Earlier in this series, I showed you how to create a Database Project using Azure Data Studio (ADS) and how to add a Database Project to GitHub. Getting a database project setup and into source control is often the beginning. I’ve found one of the hardest things to do is to keep your database in source control. This is often because it hard to keep using source control when you don’t have a way to deploy your changes.

This week, we’ll talk about one of the easier ways to deploy your database changes. One of the benefits of database projects is that they can generate data-tier applications (DAC). The data-tier applications can be bundled into what is called a DACPAC. This is a collection of files that can be used to deploy your database.

DACPACs are definitely not the only way to deploy your database. However, they are one of the main ways to deploy database projects. If you’re using to running a series of script files for your deployments, this is going to be a bit different. When deploying database projects, the database project becomes the source of truth. Generally speaking, whatever is in the database project will be deployed to the SQL Server instance.

If you’re like to figure out how you can manually deploy your database projects, you can check out the video below. If you’re looking for a more automated method to deploy database code, check back over the next couple of weeks where I’ll start showing you how you can use Azure DevOps to build and deploy your database code.

Add Database Project to GitHub

Two weeks ago, I showed you how to create a Database Project using Azure Data Studio (ADS). If you’re starting your database journey with source control and/or Continuous Integration/Continuous Delivery (CICD), that’s the first step you’ll need to complete. However, getting a database project created is just the beginning!

Once you have the database project created, you’ll want to get your database project added to source control so that you (and others) can modify and manage your database code. This next step is the beginning of allowing you and others to work on the same databases and minimize the risk of overwriting someone else’s work or deploying the wrong code to Production.

This week’s YouTube video will show you how you can get your database project to GitHub. If you’re using Azure DevOps, the process should be similar.

YouTube video showing how to add a database project to GitHub

Keep following this blog, and I walk through how to setup a process to automatically build and deploy your database code.

Use Azure Data Studio to Clone a Repo from GitHub

One of the more common questions I’ve gotten is how can Database Administrators and Database Engineers embrace database source control without using Visual Studio. I have to admit, a good part about DevOps, automation, and database source control involves using tools that people will embrace.

In this week’s YouTune video, I show how you can clone (import) a repository (database code) from GitHub all within Azure Data Studio. This is a great feature that helps make database source control more accessible to individuals who may not have access or be comfortable using Visual Studio or VS Code.

You will need to be on at least Azure Data Studio (ADS) 1.25.1. In addition, you’ll want to download Git on your machine as well as install the Database Project extension in ADS. This guide is designed for those individuals using SQL Server Data Tools (SSDT).

The YouTube video can be found here or by clicking on the image below.

Thumbnail for YouTube video: Database Projects with GitHub

Creating a Database Project in Azure Data Studio using an existing database

Check out this video on how to quickly get your databases into Database Projects. If you’re wanting to save a copy of your SQL Server database into a version control system, this is one of the first things you’ll need to do.

Database Projects can also help organize your T-SQL code that is used to make database objects. In this video, I show you how to create a database projects that groups T-SQL scripts by schema and then by database object type. No more wondering if that object is a view or table!

It’s the end of the year, and I’ve decided the best way to prepare for the New Year is to start my goals early. One of the things I’ve wanted to do all year is start a video series on how you can improve your database deployment process.

There are all sorts of fancy names that get used including source control, version control, continuous integration, continuous delivery, DevOps, etc… My goal here is make this process easy for you.

When I first started talking about source control and automated deployments, I lived in a world of Visual Studio. I was a data professional, the only real language I know is T-SQL. But I soon learned there were alot of challenges in getting my data peers to embrace using Visual Studio.

Thankfully, Azure Data Studio has created the Database Projects extension to solve this problem. Instead of giving you pages of screenshots, I have a video that will walk you through creating your first Database Project using Azure Database Studio and one of your existing databases.

I’ve created a quick YouTube video to walk you through the process. Check it out and let me know what you think! https://www.youtube.com/watch?v=DgfMycbBQUU

T-SQL Tuesday #133 – Filling in the Gaps

T-SQL Tuesday Logo - A blue database with a calendar showing days of the week wrapped artfully around the database.

I’m so glad Lisa Bohm (b|t) is hosting this month’s T-SQL Tuesday. I’m also thankful to Steve Jones for hosting T-SQL Tuesday and helping hosts share their ideas with the SQL community. I also want to thank Adam Machanic for starting T-SQL Tuesday as a blog party. I know I often have a difficult time coming up with topics, and it helps to be provided a topic.

Let’s get back to this month. Lisa says, “I’d like those of you who have presented, or written a presentation, to share something technical THAT DID NOT RELATE to the topic of the presentation”. I’m glad to respond to this month’s blog topic. The main reason is, like Lisa, I want to encourage people to present and share their experiences. One of the best things happened to me as a result of presenting, and it was something that I needed.

A couple of years ago, the company I was at was planning a data center migration using distributed Availability Groups. My boss at the time, Rie Irish (b|t), encouraged me to get a SQL Saturday presentation together. At the time, one of my largest concerns as a DBA was that I was not familiar with Server Manager. I often felt like I didn’t belong as a DBA because I came from what I perceived as a non-conventional route.

For those of you data professionals that came to this profession from outside of IT, I understand you. I have a B.B.A. in Accounting, and when I started my career, I was in business operations. I was introduced MS Access, and I was mesmerized with databases. I eventually found that if I wanted real database action, I needed to work with technology with Microsoft SQL Server.

Fast forward to this presentation for distributed Availability Groups. Prior to this presentation, I barely used PowerShell, and I had no idea how to create VM. But thanks to the encouragement I received, I figured out how to build multiple VMs using PowerShell. I made Domain Controllers and all that jazz. Honestly, I think all the things I had to do to create distributed Availability Group really helped me build confidence I didn’t realize I was missing.

Now I accept that I don’t need to know everything about infrastructure and networking. I still want to learn. But I am comfortable with Server Manager, building VMs, and I have a passing understanding of Domain Controllers. But really, the best thing I learned is that I don’t have to know all the answers. What I love is automating database deployments and understanding how applications interact with databases. And that’s ok!