theBecwar.net: October 2017

Tuesday, October 31, 2017

Spoopy - A Photography Tuesday Special

Welcome back to photography Tuesday! In honor of the Halloween holiday, I present something spooky: an abandoned house. More after the image.

Abandoned House by Mark Becwar on 500px.com

This is a fairly heavily modified version of the image that I shot. In Photoshop, I removed the saturation, increased the contrast, decreased the exposure, and added some grain, in an effort to make the image look like an old newspaper photo. I didn't get exactly the look that I was going for, but I can't be an engineer about everything, so eventually I have to stop tweaking and publish the image. Shot on a Canon EOS 5D Mark III, through an EF 50mm f/1.8 wide angle lens.

Friday, October 20, 2017

A brief study in optimization (Part 4)

Alright y'all, welcome back! Last time we said we were going to look at the algorithm we wrote in a different way.

Why did we even do the exercise last week, then? Are you a sadist?

The approach we took with the merge algorithm, is the way we're going to approach the algorithm as a whole. Rather than loop to generate triangle numbers, then loop to generate square numbers, then, finally, loop to find matching pairs, we're going to loop once and generate the triangle and square numbers as we need them.

That sounds complicated. I don't like complicated.

It's really not that bad, but, if you didn't understand part 3, you may want to go back and look at it again. I'm going to proceed on the assumption that the logic made sense to you.

One basic way to optimize something, as we saw last time, is to eliminate unnecessary loops. We can use the approach we looked at last time for merging, to combine the entire thing into one main loop. In fact, the only difference from part 3 to here, is that we'll be generating the next triangle and square number on demand. With all that said, here's the code:

triangle_square_numbers = []
n = 1
m = 1
current_triangle = 1
current_square = 1
while current_triangle < 10000000 and current_square < 10000000:
  if current_triangle == current_square:
    triangle_square_numbers.append(current_triangle)
    n = n + 1
    m = m + 1
    current_triangle = current_triangle + n
    current_square = m * m
  elif current_triangle < current_square:
    n = n + 1
    current_triangle = current_triangle + n
  else: # current_triangle > current_square
    m = m + 1
    current_square = m * m

But did it make a difference?

Well here's the timing of this approach compared to the original naive approach, and the approach from last week:

Naive Algorithm: 0.439 seconds
Improved Merge: 0.003 seconds
Combined Approach: 0.001 seconds

Which we can see is quite an improvement on the original approach.

That's it for this series. If you liked it, let me know. I may consider doing this again.

Thursday, October 19, 2017

Morning Lake (Photography Tuesday)

Welcome back for another round of Photography Tuesday. The day where I share a photo, and maybe give some information about it, and try to trick some poor fools on Twitter into reading it. (Hi, three people from Twitter!)

Morning Lake by Mark Becwar on 500px.com

Today's installment is Lake Bethany, about a half hour after sunrise. Nothing too complex about the image or framing. Canon EOS 5D Mark III through a EF 24-105mm f/4L IS USM. 60mm, ƒ/22, 1/60s, ISO 800. See y'all next week for a special shot in honor of Haloween.

Tuesday, October 17, 2017

Debugging: Don't Believe Everything You Read

Recently saw a debugging blog* talking about how to track down unmanaged memory leaks. Clearly this was written by someone who lives in a managed-only world. They talk about allocating a buffer in native code. They start by showing an example of what _not_ to do:

void DoSomething(int size)
{
    char* initialString = new char[size];
    // do stuff
}

Clearly this is wrong. The memory is allocated, and leaked when the function exits. As a proposed fix, the author offers this solution:

void DoSomething(int size)
{
    char* initialString = new char[size];
    // do stuff
    delete initialString;
}

This is much, much worse. Now instead of just leaking the string, you've invoked undefined behavior, and the compiler is free to do all kinds of funky things. For those that aren't familiar with C++, when you create an array with new, you must use delete[]. Freeing the memory any other way is undefined behavior. You may get lucky- the compiler may take pity on you, and fix your mistake. More likely, it won't.

If you're playing along at home, that means that our now corrected code is:

void DoSomething(int size)
{
    char* initialString = new char[size];
    // do stuff
    delete[] initialString;
}

It's a simple rule: use the method for freeing memory that goes with the method that you used to allocate it. (malloc -> free, new -> delete, new[] -> delete[], LocalAlloc -> LocalFree, CoTaskMemAlloc -> CoTaskMemFree, etc.)

This is, however, still wrong. This is looking at the solution as a C programmer, not a C++ programmer. The C++ compiler is smart enough to generate all of the memory management code for you. You shouldn't use new, unless you have a really good reason. The best solution is to not use a raw pointer at all, and use the standard library objects to help.

void DoSomething(int size)
{
    std::string initialString(size, '\0');
    // do stuff
}

Now, the compiler will generate code that cleans up initialString as soon as it goes out of scope. (In more general parlance this is RAII, or Resource Acquisition Is Initialization.) Why is this helpful? Let's say you modify 'do stuff' to return early in an error case. Did you remember to delete that string? If not, you have a leak that happens intermittently, and it's now much harder to debug.

If you do need to work with a raw pointer (like if you're programming with COM on Windows) use a smart pointer type to help you manage it. Here's a slightly more complex example:

void DoSomething()
{
    IUIAutomation* pAutomation;
    IUIAutomationElement* pElement;
    
    HRESULT hr = CoInitialize(NULL);
    if (FAILED(hr))
        return;
        
    hr = CoCreateInstance(CLSID_CUIAutomation, NULL,
           CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pAutomation));
    if (FAILED(hr))
        return;
    
    hr = pAutomation->GetRoot(&pElement);
    if (FAILED(hr))
    {
        if (pAutomation)
            pAutomation->Release();
    }
    
    // do something with pElement
    if (FAILED(hr))
    {
        if (pElement)
            pElement->Release();
        if (pAutomation)
            pAutomation->Release();
    }
    
    // do something else with pElement
    if (FAILED(hr))
    {
        if (pElement)
            pElement->Release();
        if (pAutomation)
            pAutomation->Release();
    }
    
    // do some more stuff
    if (pElement)
        pElement->Release();
    if (pAutomation)
        pAutomation->Release();
    return;
}

With smart pointers (in this case ATL) we can simplify that code to:

void DoSomething()
{
    CComPtr<IUIAutomation> pAutomation;
    CComPtr<IUIAutomationElement> pElement;
    
    HRESULT hr = CoInitialize(NULL);
    if (FAILED(hr))
        return;
        
    hr = CoCreateInstance(CLSID_CUIAutomation, NULL,
           CLSCTX_INPROC_SERVER, IID_PPV_ARGS(&pAutomation));
    if (FAILED(hr))
        return;
    
    hr = pAutomation->GetRoot(&pElement);
    if (FAILED(hr))
        return;
    
    // do something with pElement
    if (FAILED(hr))
        return;
    
    // do something else with pElement
    if (FAILED(hr))
        return;
        
    // do some more stuff
    return;
}

We don't need to mess with releasing the pointers manually, the CComPtr object handles it for us automatically.

*No, I'm not going to name-and-shame. That's not what this post is about.

Graffiti (Photography Tuesday)

Its Tuesday, which means that I dip into my collection of photographs and share something with you, the 5 people who accidentally clicked the link on Google/Twitter!

Today its some Graffiti on a train speeding past.

Grafitti by Mark Becwar on 500px.com

Taken with my trusty Canon EOS 5D Mark III, through an EF 24-105mm f/4L IS USM, at 105mm, ƒ/4, 1/8000s, ISO 2000.

The high speed exposure allowed me to freeze the motion of the train as it passed at 15-20 MPH, but it does show an issue with the way that the shutter works. The shutter is what's called a rolling shutter, meaning that the top of the sensor is exposed slightly before the bottom. This causes a slight skewing of vertical lines in the image in the direction of motion. Fortunately, the angle of the picture, and the relatively slow speed of the train, make the effect less noticeable.

Friday, October 13, 2017

A brief study in optimization (Part 3)

You're back!

Of course I am. We are the same person after all.

Touche. So when we left off in part 2, we had an algorithm that could get us all triangle square numbers below 100,000, but it was very slow. How slow? Not really all that slow. (About 4 milliseconds on my PC on average.)

But you said that it was slow!

Yes, it is. Try modifying it so that we find all triangle square numbers below 10,000,000 or even 100,000,000, and you'll see how slow it gets. Which leads us to the first rule of optimization.

What's the first rule of optimization, smart guy?

The first rule of optimization is don't.

Wait, so this series is a sham?

No. The general rule with optimization is that you don't optimize unless it's necessary. If we only wanted to generate the triangle square numbers below 10,000 then the algorithm we implemented in part 2 is just fine. But we want more. So the first step is to actually time our code, and get an idea how fast or slow it is, and where there are bottlenecks. So that it's clearer, I modified the code to look for all triangle square numbers below 10,000,000. I ran the code several times, and here are the results:

Average Time to Complete: 0.443 seconds
Generate Triangle Numbers: 0.0
Generate Square Numbers: 0.0
Find Triangle Square Numbers: 0.443

Obviously, the merge step at the end is the dominant bottleneck in this process, and a little analysis will tell us why.

For each item in the list of triangle numbers, we look at every single number in the list of square numbers until we're found a matching number. In the worst case (a triangle number that's not a square number) we have to iterate through the entire list of square numbers.

That sounds inefficient.

Because it is. Lets look at another way of merging these lists, and see if we can eliminate the nested for loop. Here's the code for the new merge:

triangle_square_numbers = []
count_triangles = len(triangle_numbers)
count_squares = len(square_numbers)
idx_triangle = 0
idx_square = 0
while idx_triangle < count_triangles and idx_square < count_squares:
  if triangle_numbers[idx_triangle] == square_numbers[idx_square]:
    triangle_square_numbers.append(triangle_numbers[idx_triangle])
    idx_triangle = idx_triangle + 1
    idx_square = idx_square + 1
  elif triangle_numbers[idx_triangle] < square_numbers[idx_square]:
    idx_triangle = idx_triangle + 1
  else: # triangle_numbers[idx_triangle] > square_numbers[idx_square]
    idx_square = idx_square + 1

The key of this approach is the while loop:

while idx_triangle < count_triangles and idx_square < count_squares:

This while loop looks a bit complicated, but it allows us to loop through both arrays simultaneously. It's like a for loop with two index variables.

The actual functionality is the three conditions. We know that given two numbers A and B, that one of the following conditions will be true:

A = B
A < B
A > B

The first condition is the easiest. If the two numbers are equal, then it's a triangle square number. We store the number in our list of results, and increment both indices, since we've 'used' the number from both.

The second condition means that the current triangle number is less than the current square number, and so we can move to the next triangle number.

The third condition is the opposite of the second. We have finished with this square number, so we move on to the next.

We know at the end that we haven't missed any, because the two lists are in numeric order.

That makes zero sense. Can you explain it so I can try it with real physical objects?

Ok, try this:

Take two suits from a deck of cards.
Separate the two suits into two piles.
Randomly remove five cards from each pile.
Sort each pile into ascending order (A - 2 - 3 - ... - J - Q - K), face up.
If the card's values match, take the card from the left pile and place it face down in a third pile. (This is the result pile)
If the value of the card on the left is less than the value of the card on the right, discard the card on the left.
If the value of the card on the right is less than the value of the card on the left, discard the card on the right.
If you have cards left in both piles, go back to step 5. If not, you're done.

(For bonus points, use the same card method to try out the previous implementation.)

I see how it works now, but what did it do to our execution time?

I'm glad you asked. Looking just at the merge part of the algorithm, we can see a marked change in execution time:

Old Method 0.446
New Method 0.002

You can see that the time is a bit better. Next time we'll look at a better way™

Tuesday, October 10, 2017

Warehouse (Photography Tuesday)

Happy Tuesday! Since it's Tuesday that means it's time, once again, to share a picture. Today, the cavernous emptiness of a warehouse.

Warehouse by Mark Becwar on 500px.com

Nothing really special about how this was taken. Canon EOS 5d MK III through an EF 24-105mm f/4L IS USM. Taken at 105mm, f/4, 1/60th, ISO 1000.

Friday, October 6, 2017

A brief study in optimization (Part 2)

Ok so we know what a triangle square number is.

Wait, What?

You again. Go back and read part 1, I'll wait.

Ok, I'm done. What's the next step?

Well today, we're going to write a quick program to try and generate some triangle square numbers. It will be the starting point for our explorations. (Don't leave a comment about how ugly the design is. That's the point.) I'm going to use Python 3 for these examples, because it's available to everyone, and the syntax is pretty easy to follow along with.

I don't want to install anything. Your blog has like 2 articles on it, and it's not even all that good.

You don't need to install Python on your system. One option is repl.it, but there are loads of other options. I'd recommend biting the bullet and installing Python and IPython, to make your life that much easier, but it's your call.

Anyway, Can I get back to business?

Sure.

Thanks. Today's approach is going to be to generate a list of triangle numbers less than 100,000, a list of square numbers less than 100,000 and then look at both lists to find numbers that appear in both places. First, let's generate the list of triangle numbers.

n = 1
current_triangle = 0
triangle_numbers = []
while current_triangle < 100000:
    current_triangle = current_triangle + n
    n = n + 1
    triangle_numbers.append(current_triangle)

Next we need to generate a list of square numbers. The logic will be pretty similar:

n = 1
current_square = 0
square_numbers = []
while current_square < 100000:
    current_square = n * n
    n = n + 1
    square_numbers.append(current_square)

Now that we've generated the lists of triangle and square numbers, we need to go through and find numbers that are the same in both: (This is a really naive merge algorithm. We'll look at better options coming up later in the series.)

triangle_square_numbers = []
for triangle in triangle_numbers:
    for square in square_numbers:
        if triangle == square:
            triangle_square_numbers.append(triangle)
            break

print(triangle_square_numbers)

Run all of this code, and you'll see that there are 4 triangle square numbers less than 100,000. [1, 36, 1225, 41616] You may also notice that it's painfully slow.

Yeah, man, what gives?

We'll do the analysis of this algorithm next time.

Wait, You're going to leave us hanging? I know you wrote all of these at the same time!

Yeah, but people don't want to be overloaded. See you next time.

theBecwar.net