Archive for the 'Programming' category

Docker containers for PHP with PHPDocker.io

published on March 27, 2018.

This past weekend I was playing around on some pet projects and wanted to get up and running quickly. My initial reaction was to reach for a Vagrant box provisioned with Ansible. After all, that’s what I’ve been using for a really long time now.

Recently I’ve also been learning a bit more about Docker, so I figured this pet project might be a good opportunity to replace the standard Vagrant setup and go with Docker instead. When it comes to Docker, by now I believe I have a fairly good understanding of how it works and how it’s meant to be used in a development environment. I’ve learned a lot about it from Vranac, as well as by poking around on my own.

While trying to write a set of Dockerfiles and Docker Compose files for this project, I thought there must be an easier way to do this… And then I remembered that some time ago I came across a generator for Docker-based environments for PHP projects: PHPDocker.io. As they state on their website:

PHPDocker.io is a tool that will help you build a typical PHP development environment based on Docker with just a few clicks. It supports provisioning of the usual services (MySQL/MariaDB, Redis, Elasticsearch...), with more to come. PHP 7.1 is supported, as well as 7.0 and 5.6.

Click-click-click… done.

What I like about PHPDocker is that it takes only a couple of clicks and a couple of text fields to get a nice zip with all the things needed to get a project up and running. It has support for a “generic” PHP application, like Symfony 4, Zend Framework and Expressive, or Laravel, as well as for applications based on Symfony 2/3, or Silex. PHP versions supported range from 5.6 to 7.2, and a variety of extensions can be additionally enabled. Support for either MySQL, MariaDB, or Postgres is provided, as well as a couple of “zero-config” services like Redis or Mailhog.

The zip file comes with a phpdocker directory that holds the configurations for the specific containers, such as the nginx.conf file for the nginx container. In the “root” of the zip there’s a single docker-compose.yml file which configures all the services we told PHPDocker we need:

docker-compose.yml

###############################################################################
#                          Generated on phpdocker.io                          #
###############################################################################
version: "3.1"
services:

    mysql:
      image: mysql:5.7
      container_name: test-mysql
      working_dir: /application
      volumes:
        - .:/application
      environment:
        - MYSQL_ROOT_PASSWORD=root
        - MYSQL_DATABASE=test
        - MYSQL_USER=test
        - MYSQL_PASSWORD=test
      ports:
        - "8082:3306"

    webserver:
      image: nginx:alpine
      container_name: test-webserver
      working_dir: /application
      volumes:
        - .:/application
        - ./phpdocker/nginx/nginx.conf:/etc/nginx/conf.d/default.conf
      ports:
        - "8080:80"

    php-fpm:
      build: phpdocker/php-fpm
      container_name: test-php-fpm
      working_dir: /application
      volumes:
        - .:/application
        - ./phpdocker/php-fpm/php-ini-overrides.ini:/etc/php/7.2/fpm/conf.d/99-overrides.ini

Run docker-compose up in the directory where the docker-compose.yml file is located and it’ll pull and build the required images and containers, and start the services. The application will be accessible from the “host” machine at localhost:8080, as that’s the port I chose to expose in this case. You can see that in the ports section of the webserver service.

One thing I noticed is that the mysql service doesn’t keep the data around between container restarts, but that can be fixed by adding a new line to the volumes section of that service: ./data/mysql:/var/lib/mysql. The mysql service definition should now read:

    mysql:
      image: mysql:5.7
      container_name: test-mysql
      working_dir: /application
      volumes:
        - .:/application
        - ./data/mysql:/var/lib/mysql
      environment:
        - MYSQL_ROOT_PASSWORD=root
        - MYSQL_DATABASE=test
        - MYSQL_USER=test
        - MYSQL_PASSWORD=test
      ports:
        - "8082:3306"

Other than this, I didn’t notice any issues (so far) with the environment.

Inside the phpdocker folder there’s also a README file that provides additional information on how to use the generated Docker environment, which services are available on which port, a small “cheatsheet” for Docker Compose, as well as some recommendations on how to interact with the containers.

Happy hackin’!

Bounded contexts and subdomains

published on March 20, 2018.

Back in October last year I wrote that I thought I understood bounded contexts, what they are and why we need them. Ever since realizing that a bounded context is a boundary of how a business sees a specific subject within a section of that business, learning anything and everything DDD became a lot easier.

I see bounded contexts as a big building block without which learning other parts of DDD is pretty much pointless. OK, pointless might be too harsh a word, but to be able to use entities, value objects, domain events, and aggregates to their full potential, a good understanding of bounded contexts is needed.

Of course, I haven’t stopped learning about DDD since writing that post five months ago. If anything, I did my best to learn even more by reading books and articles, watching recorded conference talks, and thinking about this subject. A lot.

Uh, phrasing

In my previous post I attached an image that I used to help explain the difference between a Book in two different bounded contexts. Recently I realized that that image has a mistake in it. In that image I used the terms “Publisher” and “Seller” to distinguish the two bounded contexts. Better names for those would probably be “Publishing” and “Selling”.

It is important to get the naming right as it affects the understanding a great deal. It might not be an outright mistake, but a bounded context is probably better off not being named after a thing. Publisher, seller, warehouse: these are the things we model inside our bounded contexts. A bounded context should name the context in which these models apply: publishing, selling, warehousing. Properly naming a bounded context will also help identify whether a model of something should be an aggregate (root), an entity, or a value object.

There are probably other things I got wrong in that post, but so far I see this naming issue as the biggest one.

What about subdomains?

One thing I didn’t know and understand when I was writing the previous post was the importance of (sub)domains in connection with bounded contexts. I’m still not 100% sure I do. A business operates within a domain, and that’s what we’re trying to design and model with DDD. It has one core domain, which represents the reason the business exists in the first place, and at least one more subdomain that supports that core domain. The core domain is the main problem a business is trying to solve, and the subdomains are all the other problems that come along with trying to solve the core domain problem.

Vaughn Vernon in his “Implementing Domain-Driven Design” book states (I’m paraphrasing here a bit) that “the subdomains live in the problem space and the bounded contexts in the solution space”. It took me a while to really understand this and what the implications of it are.

When writing software that will support the business and help solve the problems coming from the core domain and the supporting subdomains, we create models. These models will be “fine tuned” so that they provide the most optimal solution for the problem. But to provide these solutions, we also need to define the context in which these models help solve the problem.

Imagine a piece of software that is being developed to support a dentist. A dentist has two problems: fixing patients’ teeth and making appointments for the patients. Fixing teeth is the core domain and making appointments is a supporting subdomain. In the core domain the medical staff cares about a patient’s dental history, whether they can handle general anesthesia or not, what their current problem is, etc. In the subdomain the staff (not necessarily medical staff) cares about a patient’s contact information, a date and a time that best suits both the doctor and the patient, the type of dental work needed, etc. Both domains need a model of a patient, but that model will depend on the bounded context we put in place to ensure the correct information and features are available when solving the problems of each domain.
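
To make this a bit more concrete, here is a rough sketch in Go of how those two patient models could differ. The package, type, and field names are made up purely for illustration:

treatment/patient.go

// The treatment context (the core domain) models a patient
// through their medical details.
package treatment

type Patient struct {
	DentalHistory       []string // past procedures, simplified to plain strings here
	ToleratesAnesthesia bool
	CurrentProblem      string
}

appointments/patient.go

// The appointments context (the supporting subdomain) models a patient
// through contact and scheduling details.
package appointments

import "time"

type Patient struct {
	Name          string
	PhoneNumber   string
	PreferredTime time.Time
	WorkNeeded    string
}

Same word, “patient”, but two very different models, each shaped by the problem its context is solving.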

Subdomains and bounded contexts go hand in hand and I think one can’t be understood without the other. The optimal solution would be to have one bounded context in one subdomain. The world is not a perfect place, software even less so, so it might happen that one bounded context spans multiple subdomains, or that one subdomain has multiple bounded contexts.

Problems and solutions

A key element of DDD is that our understanding of the domain will constantly change and improve as we learn more about it. That’s one of the reasons why we need to be ready to change or throw away the models we came up with. This ever-evolving state means that the subdomains and the bounded contexts can and will change, too.

A bounded context can grow or shrink, split in two, be combined in one, regardless of the subdomain(s) it is in. Taking a different approach in solving a problem doesn’t mean that the problem itself has changed.

On the other hand if the problem changes, the solution should change too. If during development we realize that the core domain can be further split into a smaller, more focused core domain and a new subdomain then the solution to those problems should change. Most likely the models we developed don’t fit the problems any more and the boundaries around our contexts have moved.

This post has evolved, too

Initially this wasn’t the post I wanted to write. I did start writing about bounded contexts, but then I realized I can’t talk about them without talking about subdomains. Both the title and the contents changed at least 3 times. More topics to cover in the future I guess.

Happy hackin’!

Mockery partial mocks

published on January 28, 2018.

In dealing with legacy code I often come across a class that extends a big abstract base class, with methods that call methods on that base class which do an awful lot of things. I myself have written such classes and methods in the past. Live and learn.

One of the biggest problems with this kind of code is that it is pretty hard to test. The methods from the base class can return other objects, have side effects, make HTTP calls…

A typical example of this would be a base model class that has a getDb() method:

AbstractModel.php

<?php

abstract class AbstractModel
{
    protected $db = null;

    protected function getDb()
    {
        if ($this->db == null) {
            $db = Config::get('dbname');
            $user = Config::get('dbuser');
            $pass = Config::get('dbpass');
            $this->db = new PDO('mysql:host=localhost;dbname='.$db, $user, $pass);
        }
        return $this->db;
    }
}

which can be called in child classes to get access to the database connection:

ArticleModel.php

<?php

class ArticleModel extends AbstractModel
{
    public function listArticles()
    {
        $db = $this->getDb();
        $stmt = $db->query('SELECT * FROM articles');

        return $stmt->fetchAll();
    }
}

If we want to write unit tests for this listArticles() method, the best option would probably be to refactor the models so that the database connection can be injected either through the constructor, or with a setter method.

In case refactoring is not an option for whatever reason, what we can do is create a partial mock of the ArticleModel using Mockery and then mock (well, stub, to be more precise) only the getDb() method so that it always returns a mocked version of the PDO class:

tests/ArticleModelTest.php

<?php

use Mockery\Adapter\Phpunit\MockeryTestCase;

class ArticleModelTest extends MockeryTestCase
{
    public function testListArticlesReturnsAnEmptyArrayWhenTheTableIsEmpty()
    {
        $stmtMock = \Mockery::mock('\PDOStatement');
        $stmtMock->shouldReceive('fetchAll')
            ->andReturn([]);
        $pdoMock = \Mockery::mock('\PDO');
        $pdoMock->shouldReceive('query')
            ->andReturn($stmtMock);

        // Create a partial mock of ArticleModel; getDb() is protected,
        // so we also need to allow mocking protected methods
        $articleModel = \Mockery::mock('ArticleModel')
            ->makePartial()
            ->shouldAllowMockingProtectedMethods();

        // Stub the getDb method on the ArticleModel
        $articleModel->shouldReceive('getDb')
            ->andReturn($pdoMock);

        // List all the articles
        $result = $articleModel->listArticles();
        $expected = [];

        $this->assertSame($expected, $result);
    }
}

When we tell Mockery to make a partial mock of a class, any method on that partially mocked class that has expectations set up will be mocked, while calls to other methods will be passed through to the real class. In other words, even though the ArticleModel is a partial mock, any time we call the listArticles() method Mockery will pass that call to the original method, and only the calls to the getDb() method are mocked.

Using partial mocks should probably be an option of last resort and we should always aim to refactor code so that it is easier to test, but there are cases when they can really help us in testing legacy code.

Happy hackin’!

Build and run Golang projects in VS Code

published on January 24, 2018.

I’ve been using VS Code for my Golang development needs for a few months now. Minor kinks here and there, nothing serious, and the development experience gets better with every update. I have also tried out IntelliJ IDEA as an editor, and one feature from IDEA that I’m missing in Code is the build-run-reload process. I thought to myself, that’s such a basic feature, it should be possible to have that.

And it is! VS Code Tasks to the rescue!

These tasks allow us to run different kinds of tools and, well, tasks inside VS Code.

Go to Tasks -> Configure Default Build Task, select “Create tasks.json file from template” in the little pop-up window, and after that select the “Others” option. This tasks.json file will live inside the .vscode directory.

For my overcomplicated d20 roller, which is my first website built with Golang, I have the following content for the tasks:

.vscode/tasks.json

{
    "version": "2.0.0",
    "tasks": [
        {
            "label": "Build and run",
            "type": "shell",
            "command": "go build && ./d20",
            "group": {
                "kind": "build",
                "isDefault": true
            }
        }
    ]
}

What this one task does is run go build to build the project and then run the generated executable, which for this project is d20.

I guess that by providing a standardized name to go build with the -o flag this could be made more portable, so that the command part reads something like go build -o proj && ./proj, but I’m OK with this for now.

And now just type Ctrl+Shift+b and Code will execute this “Build and run” task for us! Hooray! The terminal window in Code will pop up, saying something like:

> Executing task: go build && ./d20 <

By going to http://localhost:8080 I can see that my little website is up and running. Cool.

But if we want to restart this task, running Ctrl+Shift+b again won’t work and Code will complain that the “Task is already active blablabla…”.

Looking at the Tasks menu, we can see that there’s a “Restart running task…” menu entry. Clicking that pops up a window with a list of running tasks. In this case there’s only one, our “Build and run” task. Clicking through the menu every time would be boring, so let’s add a keyboard shortcut for it.

Go to File -> Preferences -> Keyboard shortcuts (or just hit Ctrl+k Ctrl+s), search for “Restart running task” keybinding, and set it to whatever you like. I’ve set it to Ctrl+Alt+r.

Finally, the flow is Ctrl+Shift+b to start the task for the very first time, code-code-code, Ctrl+Alt+r to re-build. Sweet. Now the only annoying bit is that I have to pick out that one running task from the list of running tasks when restarting, but I can live with that. For now.

Happy hackin’!

Buffered vs. unbuffered channels in Golang

published on December 25, 2017.

As any newcomer to Golang and its ecosystem, I was eager to find out what the hubbub is about these things called goroutines and channels. I read the documentation and blog posts, watched videos, tried out some of the “hello world” examples, and even wasted a couple of days trying to solve the puzzles for day 18 from Advent of Code 2017 using goroutines and failed spectacularly.

All this was just… doing stuff without actually understanding when, how, and why to use goroutines and channels. And without that basic understanding, most of my attempts at doing anything that resembles a useful example ended up with deadlocks. Lots and lots and lots of deadlocks.

Ugh. I decided to stop forcing myself to understand all of this concurrency thing and figure it all out once I actually need it.

A few days later, I started reading the description of the first puzzle of day 20. The gist of the problem is that we have many particles “flying” around on three axes: X, Y, and Z. Each particle has a starting position, a starting velocity, and a constant acceleration. When a particle moves, the acceleration increases the velocity of the particle, and that new velocity determines the next position of the particle. Our task is to find the particle that will be the closest to the center point after every particle has moved the same number of times.

Then there’s this one sentence in the description:

Each tick, all particles are updated simultaneously.

Could it… could it be? A puzzle where goroutines can be applied to do what they were meant to be doing? Only one way to find out, and that’s to try and use goroutines to solve the puzzle.

Prior knowledge

Up until this point I knew the following things about goroutines and channels:

  • goroutines are started with the go keyword,
  • we must wait on the goroutines to be done, otherwise they fall out of scope and things sort of leak,
  • the waiting can be done with either sync.WaitGroup or with a “quit” channel,
  • I prefer the sync.WaitGroup approach as it is easier for me to follow it,
  • goroutines communicate through channels, by sending and receiving a specific data type on them,
  • there are buffered and unbuffered channels, but other than that it’s pretty much ¯\_(ツ)_/¯
  • I think it’s not possible to send to and receive from the same channel in the same goroutine?

Setting up the code

From the puzzle’s description we know we have a bunch of particles, every particle has an ID from zero to N, every particle has a position, velocity, and acceleration, and we want to know the particle’s distance from the center. I went ahead and made a struct like this:

type Particle struct {
	id int   // the particle's ID, from zero to N
	p  coord // position
	v  coord // velocity
	a  coord // acceleration
	d  int   // distance from the center point
}

The coord is again a struct that holds the XYZ coordinates and has two methods on it, nothing fancy. When the particle moves, we add the acceleration to the velocity and the velocity to the position. Again, nothing interesting there.
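
For completeness, here is a rough, simplified sketch of what that setup could look like; the method names and the abs helper are illustrative, and the distance is the Manhattan distance the puzzle asks for:

type coord struct {
	x, y, z int
}

// add adds another coordinate to this one, component by component.
func (c *coord) add(o coord) {
	c.x += o.x
	c.y += o.y
	c.z += o.z
}

// distance is the Manhattan distance from the center point <0,0,0>.
func (c coord) distance() int {
	return abs(c.x) + abs(c.y) + abs(c.z)
}

func abs(n int) int {
	if n < 0 {
		return -n
	}
	return n
}

// move performs one tick: the acceleration increases the velocity,
// and the new velocity determines the next position.
func (p *Particle) move() {
	p.v.add(p.a)
	p.p.add(p.v)
	p.d = p.p.distance()
}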

That’s pretty much all the setup I had.

Time to move

With all the prior knowledge I had, my idea was to move every particle 10000 times, each in a separate goroutine, send every particle over a channel to some place where I can compare them and find the one that’s the closest to the center, and use sync.WaitGroup to orchestrate the waiting on the goroutines that move the particles. The 10k is just a random number that seemed high enough so that all particles are given enough time to move.

The first iteration looked something like the following:

func closest(particles []Particle) Particle {
	var cp Particle
	var wg sync.WaitGroup
	wg.Add(len(particles))
	pch := make(chan Particle)

	for _, particle := range particles {
		go move(particle, 10000, pch, &wg)
	}

	wg.Wait()
	close(pch)
    return cp
}

cp will hold the closest particle, the wg does all the waiting, and the pch channel is there to send particles over.

For every particle in the slice of particles I tell it to “split out” into a goroutine and move that Particle 10000 times. The move function moves the particle for the given number of iterations, sends it over the pch channel once we’re done moving it, and tells the wait group that it’s done:

func move(particle Particle, iterations int, pch chan Particle, wg *sync.WaitGroup) {
	for i := 0; i < iterations; i++ {
		particle.move()
	}
	pch <- particle
	wg.Done()
}

So far this seems like something that could work. Running this code as is results in “fatal error: all goroutines are asleep - deadlock!“. OK, it sort of makes sense that it fails: we only send to the particle channel, we never receive from it.

Send and you shall receive

So… let’s receive from it I guess:

func closest(particles []Particle) Particle {
    // snip...
	for _, particle := range particles {
		go move(particle, 10000, pch, &wg)
	}

    p := <-pch
    log.Println(p)

	wg.Wait()
	close(pch)
    return cp
}

Surprise! “fatal error: all goroutines are asleep - deadlock!“. Errr…

Oh, right, we can’t send and receive on a channel in the same goroutine! Even though the receiving here is not in the same goroutine as the sending, let’s see what happens if we do the receiving in a goroutine of its own:

func closest(particles []Particle) Particle {
    // snip...
	for _, particle := range particles {
		go move(particle, 10000, pch, &wg)
	}

    go func() {
        p := <-pch
        log.Println(p)
    }()

	wg.Wait()
	close(pch)
    return cp
}

Guess what? “fatal error: all goroutines are asleep - deadlock!” This should totally be possible, I’m doing something wrong!

Matej mentioned on Twitter the other day that buffered channels can send and receive inside the same goroutine. Let’s try a buffered channel and see if that works.

Maybe buffers are what we need after all

When creating the pch channel, we pass in a second argument to make, the size of the buffer for the channel. No idea what’s a good size for it, so let’s make it the size of the particles slice:

func closest(particles []Particle) Particle {
    // snip...
	pch := make(chan Particle, len(particles))

	for _, particle := range particles {
		go move(particle, 10000, pch, &wg)
	}

    p := <-pch
    log.Println(p)

	wg.Wait()
	close(pch)
    return cp
}

Run it and… A-ha! One particle gets logged! There’s no comparing of particles in there yet, so it must be the first particle that was sent on that channel. OK, let’s range over it, that’ll do it:

func closest(particles []Particle) Particle {
    // snip...
	for _, particle := range particles {
		go move(particle, 10000, pch, &wg)
	}

    for p := range pch {
        log.Println(p)
    }

	wg.Wait()
	close(pch)
    return cp
}

“fatal error: all goroutines are asleep - deadlock!” motherf… gah! Fine, I’ll loop over all the particles and receive from the channel, see what happens then:

func closest(particles []Particle) Particle {
    // snip...
	for _, particle := range particles {
		go move(particle, 10000, pch, &wg)
	}

    for i := 0; i < len(particles); i++ {
		p := <-pch
		log.Println(p)
	}

	wg.Wait()
	close(pch)
    return cp
}

Bingo! All particles logged, no deadlocks. Throw in a closure to find the closest particle and we’re done!

// snip...
    var findcp = func(p Particle) {
        if cp.d == 0 || p.d < cp.d {
            cp = p
        }
    }
    for i := 0; i < len(particles); i++ {
		p := <-pch
		findcp(p)
	}
// snip...

For my input I get that the answer is Particle number 243, submit it to Advent of Code and it’s the correct answer! I did it! I used goroutines and channels to solve one puzzle!

But… how?

I made it work, but I still didn’t understand how and why this works. Time to play around with it some more.

I’ve seen code examples that use range to range over a channel and do something with whatever is received from it. In countless blog posts and tutorials I’ve never seen a “regular” for loop with a receive from the channel inside it. There must be a nicer way to achieve the same thing. Re-reading a couple of the articles, I spotted the error I made in the range approach: I was closing the channel too late:

func closest(particles []Particle) Particle {
    // snip...
	for _, particle := range particles {
		go move(particle, 10000, pch, &wg)
	}

	wg.Wait()
	close(pch)

    for p := range pch {
        findcp(p)
    }

    return cp
}

Particle number 243! After some thinking about it, it makes sense, or at least this is how I made it make sense:

Lesson number 1 — golang’s range doesn’t “like” open-ended things; it needs to know where the thing we range over begins and where it ends. By closing the channel we tell range that nothing will be sent to pch any more, so it knows when to stop.

To put it another way, if we leave the channel open when we try to range over it, range can’t possibly know when the next value will come into the channel. It might happen immediately, it might happen in 2 minutes, or it might never happen, so range just keeps waiting.
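
A minimal, standalone example of this behaviour, separate from the puzzle code:

package main

import "fmt"

func main() {
	ch := make(chan int, 3)

	ch <- 1
	ch <- 2
	ch <- 3
	// without this close, the range below would drain the buffer
	// and then block forever, ending in a deadlock
	close(ch)

	// range keeps receiving until the channel is both empty and closed
	for n := range ch {
		fmt.Println(n)
	}
}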

Next step is to try and make it work using unbuffered channels. If I just make this buffered channel an unbuffered one, but otherwise leave the working code as-is, it blows up with a deadlock. Something something same goroutine.

Back to the example where I tried to receive in a separate goroutine:

    // snip..
    go func() {
        p := <-pch
        log.Println(p)
    }()
    // snip...

Ah, this goroutine receives only once. It even prints out a single particle at the beginning; I just didn’t look closely enough the first time, all I saw was the deadlock error message, and I moved on. I should probably loop over the channel somehow and receive from it inside that loop.

I remembered reading something about a weird-looking for/select loop, so let’s try writing one of those:

    // snip..
    go func() {
        for {
            select {
            case p := <-pch:
                findcp(p)
            }
        }
    }()
    // snip...

Particle number 243! Yey! And again, after giving it some thought, this is how I explained this unbuffered channel version to myself:

Lesson number 2 — an unbuffered channel can’t hold on to values (yah, it’s right there in the name “unbuffered”), so whatever is sent to that channel must be received by some other code right away. That receiving code must be in a different goroutine, because one goroutine can’t do two things at the same time: it can’t send and receive; it must be one or the other.
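
Again, a minimal, standalone example of this, separate from the puzzle code:

package main

import "fmt"

func main() {
	ch := make(chan string) // unbuffered: a send blocks until someone receives

	done := make(chan struct{})

	// the receiver has to live in a different goroutine, otherwise
	// the send below would block forever and end in a deadlock
	go func() {
		fmt.Println(<-ch)
		close(done)
	}()

	ch <- "hello" // blocks until the goroutine above receives the value
	<-done        // wait for the goroutine to finish printing
}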

So far, whatever I’ve thrown at these two lessons, they’re standing their ground.

Buffered or unbuffered?

Armed with these two bits of new knowledge, when to use buffered and when to use unbuffered channels?

I guess buffered channels can be used when we want to aggregate data from goroutines and then do some further processing with that data, either in the current goroutine or in new ones. Another possible use case is when we can’t or don’t want to receive on the channel at the exact moment a value is sent, so we let the senders fill up the channel’s buffer until we can process it.

And I guess in any other case, use an unbuffered channel.
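
To make that first use case a bit more concrete, here is the general shape of the pattern, boiled down from the puzzle solution; the jobs slice and the “work” done on it are just placeholders:

package main

import (
	"fmt"
	"sync"
)

func main() {
	jobs := []int{1, 2, 3, 4, 5}

	// buffered, so every worker can send its result without waiting on a receiver
	results := make(chan int, len(jobs))
	var wg sync.WaitGroup

	for _, job := range jobs {
		wg.Add(1)
		go func(j int) {
			defer wg.Done()
			results <- j * 2 // placeholder for the real work
		}(job)
	}

	// wait for all the workers, then close so that range knows when to stop
	wg.Wait()
	close(results)

	for r := range results {
		fmt.Println(r)
	}
}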

And that’s pretty much all I learned from this one Advent of Code puzzle. Here’s my final solution using buffered channels and here’s the one using unbuffered channels. I even figured out other ways to make it work while writing this blog post, but those solutions all come from understanding how channels work.

If you spot any errors in either my working code examples or in my reasoning, please let me know. I want to know better. Thanks!

Happy hackin’!
