Sunday, 19 July 2020

[min heaps] Jesse and Cookies

The UC San Diego course on coursera has been really helping me learn a lot about Data Structures. The latest thing I've learned is about binary heaps, and priority queues. So, I've been solving some data structures problems on the side on HackerRank in order to gain a further understanding. One such problem is Jesse and Cookies.

This problem is really simple if you know what a min heap is. The way to do that in C++ (STL) is make a priority queue with a custom comparison function:

std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
std::priority_queue<int, std::vector<int>> maxHeap;

We need a minHeap here. What we do is this:

1. Push all the cookies into the minHeap
2. initialize integer operations = 0
3. while root of minHeap < k:
          int newSweetness = minHeap.getMin()
          minHeap.ExtractMin()
          newSweetness += 2 * minHeap.getMin()
          minHeap.ExtractMin()
          minHeap.Push(newSweetness)
          increment operations by 1
4. return operations

Why does this work?
Due to the property of the min Heap data structure we know that each node is >= to its parent. This means the smallest node is the root, and if that is >= k, every single node in the heap is >= k.

So for each iteration we check if the root is >= k. If not, we take the smallest node and the second smallest nodes out, and put in a new node with sweetness = (smallest) + 2*(second smallest)

The moment the root becomes >= k, we know we are done!

Wednesday, 15 July 2020

[stacked up odds] Two stacks, one queue

Task sounded simple. Implement a queue with 2 stacks, which as the methods Push(), Pop() and Front().

Well, the fundamental problem here is stack cannot give me the element at the bottom in a neat way. It has to be dirty. Also, how do I keep track of Pop() operations on the queue? 2 stacks. I need to keep track of elements in the inverse order of my stack.

Okay here's something: what if the reason we require two stacks is that we maintain, for each operation, a stack which is the right way up, and a second stack which has elements in the reverse order as the first? The first stack helps us push elements into the queue. The second stack helps with Pop() and Front(). I will call the first stack as the front stack, and second as the rear stack

Let's be naive first. Let us update the two stacks every single time any of the queue methods are called. The major testcase eater here is that there is no way to nicely invert a stack. I went with declaring a temporary stack, copying the elements of the front stack to it (std::stack has a nice copy constructor for this), and then running a while loop and pushing the top of the temporary stack to the rear stack and then popping the temporary stack, so that elements of the front stack get pushed onto the rear stack in reverse order. That was a mouthful. A lot of pushes and pops. But hopefully you got the idea.

If we do all this, and repeat it (update both stacks for every queue method call) this eats up a LOT of time. Half the testcases pass with this one.

I couldn't figure this part out, so I went to the discussion section. There I found what I had been missing. In order to explain this, I must make an analogy with what we do for dynamic arrays.

Dynamic Arrays, as you might know, are resizable arrays (std::vector for C++). Now note the parallel. There is no neat way of resizing a proper C++ array. It has to be dirty. We are going to have to re-allocate a new array, copy all the elements to the new array. There is simply no other way, as that is how the data structure works fundamentally.

So what do we do? We do it less often. Only when necessary.
Dynamic Arrays achieve a PushBack() complexity of amortized O(1) by doubling the array capacity when it is full. Similarly, we must do this reversal of stack and copying of stack elements only when it is needed.

What can change given this idea?
First, the rear stack need not be updated if it is not empty! The only time it needs to update is if it is empty. Because if we have elements left in the rear stack, the top() is still correct, no matter what we have pushed. Similarly the front stack doesn't need to still contain all the pushed elements. Once it updates rear stack, the front stack's job is essentially nothing, apart from storing the new incoming elements. Once it pushes those to the rear stack, it doesn't need to care about the elements that were transferred to the rear stack.
Therefore, that the temporary stack was unnecessary. We can directly pop from the front stack while updating the rear, instead of first making a copy.

We update the rear stack only on calls to Pop() or Front(). We needn't do so for Push(). Similarly we need to update the front stack only on calls to Push().

With all this done, the solution now passes all the testcases.
The lesson I learned was this: If there is doesn't exist any neat way of doing something. If there is something you just cannot bypass due to the fundamental nature of the tools you use to solve it, it has to be dirty. It can be done more efficiently by doing the dirty work only and only when required, and nowhere else.

PS: I pity those who noticed the pun in the title ._.

Monday, 6 July 2020

Between the Brackets #1 : Practice vs. Lectures



Between the Brackets is a new thing I'm starting on this blog. It's a series where we have discussions over the non-technical things in coding, my experiences, in general, coding and chill.

So let's get to the point. Which helps you more in improving your coding skills? A lot of practice, or watching lectures, from sources like Coursera or YouTube or MIT OCW?

I, so far have relied on sheer practice, and reading up things based on the questions I faced, as an when I feel I need to read it. This has definitely improved my coding skills, and indeed that's what my friends and seniors encouraged me do.

So we started this Team Resurgence thing, tri-weekly tournaments, discussions, and reading on demand. But it slowly fizzled out. Participation dropped from 10 to 3-4 people. Later, we simply stopped.

So its been a bit. A month. A very transformative month indeed. After two whole months of solving tournaments every other day, I got burnt. It was inevitable. I decided to double down a bit.

Recently, I started a course on Data Structures from UC San Diego on Coursera. I'm done with week one, and it's a breath of fresh air. One important lesson I have learnt is that coding is very hard if you do it only through trial and error. And while there are dozens of YouTube channels which can seem very appealing, they are often incomplete, since most such YouTubers are also employed (those who make coding videos) in a 9 to 5, so they do not make their videos "complete". After the amount of trial and error based learning I did, this course felt way better and had less hand holding for the basics. I already knew most basics already, so it was less annoying.

Now, about platforms like GeeksForGeeks. I find these platforms to be very cookbook style, and indeed they are useful, but I don't recommend this platform as a place for first time learners. I mean, you might find quick code and documentation if you forget something, but the explanations are very brief, and the accompanying video also is not very good. So I use GeeksForGeeks to figure out what to learn, or to refresh memory.

Now we come to MIT OCW.
MIT OCW is a great place to learn. I have done Multivariable Calculus from OCW and it helped me get the highest grade in the MA1101 course we have here at IIT Madras.
So naturally my first instinct was to start MIT 6.006 Discrete maths for CS. And that course was so hard that it made me quit it multiple times for weeks. I feel, specially in India, the Engineering admission exams are heavily trick based and focus on Calculus and Geometry, and less on Algebra and Combinatorics. So having trained so hard on continuous mathematics, it might take a while to start activating your discrete math spider sense.

Yes, I said spider sense. Concepts like induction and combinatorics require you to recognize patterns and try to propose solutions based heavily on past experiences. So beginners have to learn to not feel bad about being completely lost. I know, it sucks. But suck it up. Suck up the knowledge you gain by failing repeatedly. (I think I'm being a bit overly positive and hypocritical when I say this. I'm no expert, and I definitely feel really dumb sometimes while solving these. Also, I haven't yet mastered that art of not feeling bad about being lost. The point is we have to embrace this together, dear reader.)

So coming back to courses. I feel we must have a balance of the two. And auditing courses on Coursera and EdX is definitely better than relying on YouTube and MIT OCW. I'm going to have to stick to this course, and not leave it midway. In general, experiment. Pick and choose which course/channel you prefer. Your choice may not be mainstream, but if it works for you, go ahead and do it.

And that's what is Between the Brackets this week.