Monday, 21 September 2020

[The Essence of Binary Search] Aggressive Cows

Ahh, a long and tiring day draws to a close, and boy have I learned a lot today!

Today we are going to talk about binary search, and more importantly how to sniff out binary search problems where the binary search part is absolutely not obvious. At least to me. Here is today's one lone problem. I legit spent about 6 hours exploring it and the concepts related to it.

Aggressive Cows on SPOJ

Binary Search. An interesting and fast (O(log n)) method to look for values in a binary search tree or a sorted array. But it is so much more than that. Let's dive into some mathematics:

At its core, what is Binary Search? At each step, we check the truth value of a statement (most of the time it is "is the value <= the number to be searched?"), and depending on that value, we decide whether we have found our element, or whether we go left or right in our journey to hunt down the searched value.

What we do with binary search is find the maximum value which satisfies our statement. 

(since the maximum value satisfying "value <= x" is x itself, provided x exists in the array, we can use this to search for x)

So essentially,

"Binary search converts an optimisation problem to a decision problem"

Our proposition is true for all cases before the answer, and false after it. So our array essentially turns into a TF sequence like this:

TTTTTTFFFFFFFFFFFF

Let us refer to this type of sequence as a TF sequence from now on.

And the mission of binary search is to find the last T.

Therefore the functionality of binary search can be modified to find the least (or conversely the greatest) element which fulfills a proposition P, which has to generate a sequence of T's followed by a sequence of F (or conversely, a sequence of F followed by a sequence of T).

This idea is so powerful that we can exploit it in a LOT of seemingly O(n²) solutions to bring them down to O(n log n).
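To make that concrete, the whole pattern can be captured in one reusable helper (a sketch; the name last_true and the sentinel convention are my own):

```python
def last_true(lo, hi, pred):
    """Return the largest k in [lo, hi] with pred(k) True.

    Assumes pred yields a TF sequence over [lo, hi]: all True up to
    some point, then all False. Returns lo - 1 if pred is never True.
    """
    lo -= 1              # invariant: pred is True at lo (or lo is the sentinel)
    hi += 1              # invariant: pred is False at hi (or hi is the sentinel)
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if pred(mid):
            lo = mid     # the last T is at mid or to its right
        else:
            hi = mid     # the last T is strictly to mid's left
    return lo
```

For example, last_true(1, 9, lambda k: k * k <= 50) hunts down 7, the largest k whose square stays within 50.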

Let us define our proposition 

Px(k) : "It is possible to arrange x cows in the barn such that distance between them is at least k"

Before using this, we must prove that Px(k) yields a TF sequence.

Proof:

If Px(k) is true, we have a situation like this:

COW.....k.....COW

Just shift that cow one step closer. COW.....(k-1).....COW

Therefore Px(k) --> Px(k - 1)

But the same need not be true for Px(k + 1): if k = n (the barn length) and x = 2, both cows occupy the very ends, and Px(n + 1) is not possible as the barn is not that long.

We have established that if Px(k0) is true, then Px(k) is also true for all k <= k0.

Therefore, Px(k) generates a TF sequence over k = 1 to n.

Now, we could do this:

Evaluate the truth value of Px(k) for each k, obtain the TF array, then find the last T. Each evaluation of Px(k) is O(n), so doing this for every k from 1 to n would be O(n²) in total.

But why do that?

If Px(k) yields a TF array, we can use binary search to find the last T. 

We evaluate Px(k) only at THAT k which is required by binary search to obtain the answer.

Barn in the example testcase:     

 0   1    2    3    4    5   6   7   8   9

And we have a set of k's for which we want to find the greatest k for which Px(k) is true:

k = {1, 2, 3... 9}

define start = 0, end = 9. (n = 9, x = 3)

We find mid = k = floor( (start + end)/2 ). We check if Px(k) is valid for that k (=mid).

If we get T, we can look further to the right, as there may be more T's there.

If we hit F, there DEFINITELY isn't any T to the right, so we go left.

start = 0, end = an index guaranteed to be an F (one more than the largest possible distance is always safe)

while (start < end):

    mid = floor( (start + end) / 2 )

    if (checkPx(mid))

        start = mid + 1

    else

        end = mid

Question 1: Figure out what you would return when this loop finishes executing!

Now how to check Px(mid)?

Again, we use greedy approach to do this:

 checkPx(k):

    Put the first cow in first stall.

    Move along the barn until you find a stall which is at least k away from the last placed cow

    If you find one:

        Put a cow there

    Repeat until a) there are no cows left -> return true

                 b) the barn ends with cows still remaining -> return false

Ok, now we know all the pieces of the puzzle!

Answer to question 1: we would return start - 1, as start ends up at the first F; decrementing it by 1 gives the last T.

Now, it's time for some code:


#include <iostream>
#include <vector>
#include <algorithm>

bool checkMyK(int k, int cows, std::vector<int>& barn) {
	if (cows <= 1) {
		return true; // a single cow can always be placed
	}
	int currentPos = 1;
	int prevCowPos = 0;
	int numCowsPlaced = 1; // the first cow always goes in the first stall
	while (currentPos < (int)barn.size()) {
		if (barn[currentPos] - barn[prevCowPos] >= k) {
			numCowsPlaced++;
			if (numCowsPlaced == cows) {
				return true;
			}
			prevCowPos = currentPos;
		}
		currentPos++;
	}
	return false;
}

int modifiedBinarySearch(std::vector<int>& barn, int cows) {
	// end is one past the largest possible distance, so it is guaranteed to be an F
	int start = 0, end = barn[barn.size() - 1] - barn[0] + 1;

	while (start < end) {
		int mid = (start + end) / 2;
		if (checkMyK(mid, cows, barn)) {
			start = mid + 1;
		}
		else {
			end = mid;
		}

		//std::cout << start << " " << end << "\n";
	}

	return start - 1;
}

int main() {
	int t;
	std::cin >> t;

	while (t--) {
		int n, cows;
		std::vector<int> barn;
		std::cin >> n >> cows;

		while (n--) {
			int val;
			std::cin >> val;
			barn.push_back(val);
		}

		std::sort(barn.begin(), barn.end());

		std::cout << modifiedBinarySearch(barn, cows) << "\n";
	}
}
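For completeness, the same solution can be sketched in Python (my own port, mirroring the C++ above; the one deliberate change is that end starts one past the largest possible distance, so that end is always an F):

```python
def check_px(k, cows, barn):
    """Greedy check: can we place `cows` cows with pairwise distance >= k?"""
    placed = 1                    # the first cow always goes in the first stall
    prev = barn[0]
    for stall in barn[1:]:
        if stall - prev >= k:
            placed += 1
            prev = stall
            if placed == cows:
                return True
    return placed >= cows         # also handles cows == 1

def aggressive_cows(barn, cows):
    barn = sorted(barn)
    start, end = 0, barn[-1] - barn[0] + 1   # end is guaranteed to be an F
    while start < end:
        mid = (start + end) // 2
        if check_px(mid, cows, barn):
            start = mid + 1       # more T's may lie to the right
        else:
            end = mid             # no T's to the right of an F
    return start - 1              # start lands on the first F
```

On the SPOJ sample (stalls 1 2 4 8 9, 3 cows) this gives 3, as expected.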

More reading material and references:

https://www.topcoder.com/community/competitive-programming/tutorials/binary-search/

https://discuss.codechef.com/t/spoj-problem-aggrcow-aggressive-cows/11971/12 

Saturday, 22 August 2020

Longest Path In a Tree

I recently tried out SPOJ (Sphere Online Judge), and it's pretty good for practicing algorithms. Here is a pretty simple SPOJ question.

The tree is undirected, meaning the connections can be given in any order.

The general idea is this. Start at any node and traverse in depth first order, keeping track of the depth of the subtree under each node, while making sure we do not end up going in circles. Then we take the deepest child of the root and the second deepest child of the root, add their depths, and add 2 more edges: the ones which connect the root to the two deepest children. There are a few edge cases. Try to think which, and how you'll handle them.

For trees, I really prefer to make a class and solve the question. I actually have a basic class ready for Binary Search Trees with all types of traversals (breadth first, depth first and in-order). I just had to copy that class in, and modify it for n-ary trees. Here is the class:

#include<iostream>
#include<vector>
#include<queue>

struct Node {
	int val;
	int parent, left, right;
};

class BST {
private:
	std::vector<Node> tree;
public:
	Node* root = nullptr;

	BST(std::vector<Node> _tree) :tree(_tree){
		if (_tree.size() != 0) {
			root = &tree[0];
		}
		else {
			root = nullptr;
		}
	}

	void InorderTraversal(Node* root) {
		if (root == nullptr) {
			return;
		}
		if (root->left != -1) {
			InorderTraversal(&tree[root->left]);
		}
		std::cout << root->val << " ";
		if (root->right != -1) {
			InorderTraversal(&tree[root->right]);
		}
	}

	void PreOrderTraversal(Node* root) {
		if (root == nullptr) {
			return;
		}
		std::cout << root->val << " ";
		if (root->left != -1) {
			PreOrderTraversal(&tree[root->left]);
		}
		if (root->right != -1) {
			PreOrderTraversal(&tree[root->right]);
		}
	}

	void BreadthFirstTraversal(Node* root) {
		if (root == nullptr) {
			return;
		}
		std::queue<Node*> Q;
		Q.push(root);
		while (!Q.empty()) {
			std::cout << Q.front()->val << " ";
			if (Q.front()->left != -1) {
				Q.push(&tree[Q.front()->left]);
			}
			if (Q.front()->right != -1) {
				Q.push(&tree[Q.front()->right]);
			}
			Q.pop();
		}
	}
};
I modified this class for n-ary trees by removing the left, right and parent data in Node, and replacing that with a 
std::vector<int> neighbors;

This stores the indices of all the nearest neighbors of that node, and since the tree is undirected, if there exists an edge between 1 and 2, 1 appears in 2's neighbor list and 2 appears in 1's neighbor list. So how do we avoid going in circles?

We add another boolean variable called "visited" to the node struct, which will be false by default.

bool visited  = false;

We will be using a depth first search in order to sift through the vertices, and find the depth. So let us add another data point to our struct which keeps track of the maximum depth of the subtree under a vertex.

int maxDepthOfSubtree = 0;

While we do the DFS, we increase the depth as we move back up from the deepest point of the tree, meaning after our recursive call. So this variable is updated from the bottom up, not top to bottom, which is opposite to the direction of our DFS.

Our DFS first of all sets the visited status of the current node to true, then recursively scans each of its non-visited neighbors. When the recursion returns, the value of maxDepthOfSubtree will be correct for all the neighbors, so we can just add 1 to the maximum among them and assign this value to maxDepthOfSubtree of the current node.

for (int child : root->neighbors) {
	if (!tree[child].visited) {
		DFS(&tree[child]);
		if (root->maxDepthOfSubtree < tree[child].maxDepthOfSubtree + 1) {
			root->maxDepthOfSubtree = tree[child].maxDepthOfSubtree + 1;
		}
	}
}

Now to return our answer, we simply compare the maxDepthOfSubtree of all the neighbors of our first node (which we will call the root), and add the two maximum subtree depths, and add 2 to them, edge cases being zero children and 1 child.

int MaxDist() {
	DFS(root);
	std::vector<int> depths;
	for (int child : root->neighbors) {
		depths.push_back(tree[child].maxDepthOfSubtree);
	}

	// needs <algorithm> and <functional>
	std::sort(depths.begin(), depths.end(), std::greater<int>());

	if (depths.size() >= 2) {
		return depths[0] + depths[1] + 2;
	}
	else if (depths.size() == 1) {
		return depths[0] + 1;
	}
	else {
		return 0;
	}
}

To be honest there isn't anything special about the root, unlike in Binary Search Trees. Since this IS definitely a tree, any node can be made the root and it remains a tree. One caveat though: the longest path need not pass through whichever node we picked as the root, so to be strictly correct we should take the maximum of (two deepest subtree depths + 2) over every node, not just the starting one.
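If we'd rather not depend on the longest path passing through our chosen root at all, there's a standard two-pass trick: search from any node to find the farthest node u, then search again from u; the farthest distance found from u is the longest path. A Python sketch of that alternative (adjacency lists as a dict; the names are my own):

```python
def farthest(adj, src):
    """Return (node, dist) of the vertex farthest from src in a tree."""
    dist = {src: 0}
    stack = [src]
    best = (src, 0)
    while stack:
        v = stack.pop()
        if dist[v] > best[1]:
            best = (v, dist[v])
        for w in adj[v]:
            if w not in dist:           # dist doubles as the visited set
                dist[w] = dist[v] + 1
                stack.append(w)
    return best

def diameter(adj):
    u, _ = farthest(adj, next(iter(adj)))   # farthest node from an arbitrary start
    _, d = farthest(adj, u)                 # farthest distance from u = the answer
    return d
```

On the tree with edges 1-2, 2-3, 2-4, 3-5, 4-6 this finds the 4-edge path 5-3-2-4-6, which a single pass rooted at node 1 would miss.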

Tuesday, 18 August 2020

[keeping bribes minimum] New Year Chaos

This question was particularly nasty for me, because the logic was so elegant, simple, yet elusive. My 3rd semester has started here at IIT Madras, and there is no mercy, so I haven't been coding too much recently. Maybe that's the cause, but a better reason might be that it's a solid 65% success rate problem.

So, what's the deal?

My incorrect approach:

At every spot, Q[i] SHOULD have been equal to i + 1, but if it's not, the element there has a value either greater or smaller than i + 1. We figure out the number of swaps by incrementing a swap counter by

swapCounter += Q[i] - i - 1

only if this quantity is greater than zero, and then I reasoned that the ones where it is smaller than zero must have been swapped with some greater element. I thought that the smaller elements were the RESULT of a swap with an element strictly greater than itself. This logic misses cases like

3 2 1

My algorithm will totally ignore 2 and 1, but in reality, in order to achieve such a configuration, we must also swap 1 and 2 at some point, and that's where my algorithm fails. I realized this, and tried to work around this base solution to accommodate this case, but ultimately failed.

The Correct Approach:

A very simple, elegant, no fancy math way of keeping track of swaps is to update 3 variables

expectedFirst, expectedSecond, expectedThird.

Initially, these are 1, 2, 3 respectively. However, we selectively update all 3, only the latter 2, or only the 3rd one, depending upon the value we encounter in the queue.

    int totalBribes = 0;
    int expectedFirst = 1;
    int expectedSecond = 2;
    int expectedThird = 3;
    
    for (unsigned int i = 0; i < q.size(); ++i) {
        if (q[i] == expectedFirst) {
            expectedFirst = expectedSecond;
            expectedSecond = expectedThird;
            ++expectedThird;
        } else if (q[i] == expectedSecond) {
            ++totalBribes;
            expectedSecond = expectedThird;
            ++expectedThird;
        } else if (q[i] == expectedThird) {
            totalBribes += 2;
            ++expectedThird;
        } else {
            cout << "Too chaotic" << endl;
            return;
        }
    }

Ok. Let's break this down. What's different here?

At each step it compares the current number in the queue to what it expects it to be: the smallest possible number, the next smallest, and so on.

When it encounters

3 2 1

1. it sees 3, updates totalBribes by 2, then sets the expected values to (1, 2, 4)

2. it sees 2, updates totalBribes by 1, then sets the expected values to (1, 3, 4)

3. it sees 1, updates totalBribes by 0  (as 1 == 1), then sets expected values to (3, 4, 5)

It needs to keep track of only 3 because the max number of bribes which can be given by any one person is 2, and one spot for the number itself.
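Here is the same idea as a self-contained Python sketch (my own port of the snippet above; it returns the bribe count, or None for a chaotic queue, instead of printing):

```python
def minimum_bribes(q):
    """Minimum bribes to reach queue q, or None if anyone moved ahead by more than 2."""
    total = 0
    first, second, third = 1, 2, 3   # smallest values we could legally see next
    for person in q:
        if person == first:
            first, second, third = second, third, third + 1
        elif person == second:
            total += 1               # one bribe: this person overtook `first`
            second, third = third, third + 1
        elif person == third:
            total += 2               # two bribes: overtook `first` and `second`
            third += 1
        else:
            return None              # moved forward more than 2 spots: "Too chaotic"
    return total
```

Tracing minimum_bribes([3, 2, 1]) reproduces the 2 + 1 + 0 = 3 walkthrough above.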

There! Now that's elegance. I wonder when I will acquire such godlike abilities ;-;

I guess the correct way is "I never lose, I win, or I learn."

I've definitely learned today.

Sunday, 19 July 2020

[min heaps] Jesse and Cookies

The UC San Diego course on Coursera has been really helping me learn a lot about Data Structures. The latest thing I've learned about is binary heaps and priority queues. So, I've been solving some data structures problems on the side on HackerRank in order to gain a deeper understanding. One such problem is Jesse and Cookies.

This problem is really simple if you know what a min heap is. The way to get one in C++ (STL) is to make a priority queue with a custom comparison function:

std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
std::priority_queue<int, std::vector<int>> maxHeap;

We need a minHeap here. What we do is this:

1. Push all the cookies into the minHeap
2. initialize integer operations = 0
3. while root of minHeap < k:
          if minHeap has fewer than 2 cookies: return -1 (impossible)
          int newSweetness = minHeap.getMin()
          minHeap.ExtractMin()
          newSweetness += 2 * minHeap.getMin()
          minHeap.ExtractMin()
          minHeap.Push(newSweetness)
          increment operations by 1
4. return operations

Why does this work?
Due to the property of the min heap data structure, we know that each node is >= its parent. This means the smallest node is the root, and if that is >= k, every single node in the heap is >= k.

So for each iteration we check if the root is >= k. If not, we take the smallest and the second smallest nodes out, and put in a new node with sweetness = (smallest) + 2*(second smallest).

The moment the root becomes >= k, we know we are done!
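The pseudocode maps almost line for line onto Python's heapq module, which gives a min heap out of the box. One case worth guarding (HackerRank expects -1 here): if the minimum is still below k but fewer than two cookies remain, no further combining is possible. A sketch, with the function name my own:

```python
import heapq

def cookies(k, sweetness):
    heap = list(sweetness)
    heapq.heapify(heap)              # O(n) bottom-up heap construction
    ops = 0
    while heap[0] < k:
        if len(heap) < 2:
            return -1                # can't combine a single remaining cookie
        least = heapq.heappop(heap)
        second = heapq.heappop(heap)
        heapq.heappush(heap, least + 2 * second)
        ops += 1
    return ops
```

For the sample sweetnesses [1, 2, 3, 9, 10, 12] with k = 7, two operations suffice.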

Wednesday, 15 July 2020

[stacked up odds] Two stacks, one queue

Task sounded simple. Implement a queue with 2 stacks, which has the methods Push(), Pop() and Front().

Well, the fundamental problem here is stack cannot give me the element at the bottom in a neat way. It has to be dirty. Also, how do I keep track of Pop() operations on the queue? 2 stacks. I need to keep track of elements in the inverse order of my stack.

Okay here's something: what if the reason we require two stacks is that we maintain, for each operation, a stack which is the right way up, and a second stack which has elements in the reverse order of the first? The first stack helps us push elements into the queue. The second stack helps with Pop() and Front(). I will call the first stack the front stack, and the second the rear stack.

Let's be naive first. Let us update the two stacks every single time any of the queue methods are called. The major testcase eater here is that there is no way to nicely invert a stack. I went with declaring a temporary stack, copying the elements of the front stack to it (std::stack has a nice copy constructor for this), and then running a while loop and pushing the top of the temporary stack to the rear stack and then popping the temporary stack, so that elements of the front stack get pushed onto the rear stack in reverse order. That was a mouthful. A lot of pushes and pops. But hopefully you got the idea.

If we do all this, and repeat it (update both stacks for every queue method call) this eats up a LOT of time. Half the testcases pass with this one.

I couldn't figure this part out, so I went to the discussion section. There I found what I had been missing. In order to explain this, I must make an analogy with what we do for dynamic arrays.

Dynamic Arrays, as you might know, are resizable arrays (std::vector for C++). Now note the parallel. There is no neat way of resizing a proper C++ array. It has to be dirty. We are going to have to re-allocate a new array, copy all the elements to the new array. There is simply no other way, as that is how the data structure works fundamentally.

So what do we do? We do it less often. Only when necessary.
Dynamic Arrays achieve a PushBack() complexity of amortized O(1) by doubling the array capacity when it is full. Similarly, we must do this reversal of stack and copying of stack elements only when it is needed.

What can change given this idea?
First, the rear stack need not be updated if it is not empty! The only time it needs to update is if it is empty. Because if we have elements left in the rear stack, the top() is still correct, no matter what we have pushed. Similarly the front stack doesn't need to still contain all the pushed elements. Once it updates rear stack, the front stack's job is essentially nothing, apart from storing the new incoming elements. Once it pushes those to the rear stack, it doesn't need to care about the elements that were transferred to the rear stack.
Therefore, the temporary stack was unnecessary. We can directly pop from the front stack while updating the rear, instead of first making a copy.

We update the rear stack only on calls to Pop() or Front(). We needn't do so for Push(). Similarly we need to update the front stack only on calls to Push().
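Putting all of that together, here's the lazy two-stack queue sketched in Python (lists playing the role of stacks; class and method names are my own):

```python
class TwoStackQueue:
    def __init__(self):
        self.front_stack = []   # receives every Push()
        self.rear_stack = []    # holds elements reversed, ready for Pop()/Front()

    def push(self, x):
        self.front_stack.append(x)

    def _refill(self):
        # Only when the rear stack runs dry do we pay the O(n) reversal.
        if not self.rear_stack:
            while self.front_stack:
                self.rear_stack.append(self.front_stack.pop())

    def front(self):
        self._refill()
        return self.rear_stack[-1]

    def pop(self):
        self._refill()
        return self.rear_stack.pop()
```

Each element moves from the front stack to the rear stack at most once, which is exactly why Pop() and Front() come out to amortized O(1).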

With all this done, the solution now passes all the testcases.
The lesson I learned was this: if there doesn't exist any neat way of doing something, if there is something you just cannot bypass due to the fundamental nature of the tools you use, it has to be dirty. But it can be done more efficiently by doing the dirty work only and only when required, and nowhere else.

PS: I pity those who noticed the pun in the title ._.

Monday, 6 July 2020

Between the Brackets #1 : Practice vs. Lectures



Between the Brackets is a new thing I'm starting on this blog. It's a series where we have discussions over the non-technical things in coding, my experiences, in general, coding and chill.

So let's get to the point. Which helps you more in improving your coding skills? A lot of practice, or watching lectures, from sources like Coursera or YouTube or MIT OCW?

I, so far, have relied on sheer practice, and reading up on things based on the questions I faced, as and when I felt I needed to. This has definitely improved my coding skills, and indeed that's what my friends and seniors encouraged me to do.

So we started this Team Resurgence thing, tri-weekly tournaments, discussions, and reading on demand. But it slowly fizzled out. Participation dropped from 10 to 3-4 people. Later, we simply stopped.

So it's been a bit. A month. A very transformative month indeed. After two whole months of solving tournaments every other day, I got burnt out. It was inevitable. I decided to double down a bit.

Recently, I started a course on Data Structures from UC San Diego on Coursera. I'm done with week one, and it's a breath of fresh air. One important lesson I have learnt is that coding is very hard if you do it only through trial and error. And while there are dozens of YouTube channels which can seem very appealing, they are often incomplete, since most such YouTubers (those who make coding videos) also hold 9-to-5 jobs, so they do not make their videos "complete". After the amount of trial-and-error learning I did, this course felt way better, with less hand-holding for the basics. I already knew most of the basics, so it was less annoying.

Now, about platforms like GeeksForGeeks. I find these platforms to be very cookbook style, and indeed they are useful, but I don't recommend them as a place for first-time learners. I mean, you might find quick code and documentation if you forget something, but the explanations are very brief, and the accompanying videos are also not very good. So I use GeeksForGeeks to figure out what to learn, or to refresh my memory.

Now we come to MIT OCW.
MIT OCW is a great place to learn. I have done Multivariable Calculus from OCW and it helped me get the highest grade in the MA1101 course we have here at IIT Madras.
So naturally my first instinct was to start MIT 6.042 (Mathematics for Computer Science). And that course was so hard that it made me quit it multiple times for weeks. I feel, especially in India, the engineering admission exams are heavily trick based and focus on Calculus and Geometry, and less on Algebra and Combinatorics. So having trained so hard on continuous mathematics, it might take a while to start activating your discrete math spider sense.

Yes, I said spider sense. Concepts like induction and combinatorics require you to recognize patterns and try to propose solutions based heavily on past experiences. So beginners have to learn to not feel bad about being completely lost. I know, it sucks. But suck it up. Suck up the knowledge you gain by failing repeatedly. (I think I'm being a bit overly positive and hypocritical when I say this. I'm no expert, and I definitely feel really dumb sometimes while solving these. Also, I haven't yet mastered that art of not feeling bad about being lost. The point is we have to embrace this together, dear reader.)

So coming back to courses. I feel we must have a balance of the two. And auditing courses on Coursera and EdX is definitely better than relying on YouTube and MIT OCW. I'm going to have to stick to this course, and not leave it midway. In general, experiment. Pick and choose which course/channel you prefer. Your choice may not be mainstream, but if it works for you, go ahead and do it.

And that's what is Between the Brackets this week.

Monday, 25 May 2020

[Learning DSA series!] Trees: PreOrder Traversal

So, I was starting to get pretty bored of those standard array based questions. Graphs and trees seem mysterious and interesting. I don't want to pretend that this is a tutorial. It's not even close.

I'm just going to describe what I learn, my experiences, where I struggled and what not. If anyone at all is reading this, you will probably get some doubts cleared (if our doubts happen to match), and also get a sense that CS is hard. A lot of people online cast an image of being natural geniuses, and so-called "normies" like us feel like we can never catch up. It's mutual. I just want the guy struggling to master coding, like me, to feel better and know that it's hard for everyone, not just for them. And, prolly, to share the fascination and interest in coding.

As CarryMinati would say, "Shuru karte hain bina kisi bakchodi ke" (let's start without any nonsense):

First, some terminology.

Tree: A data structure consisting of interconnected nodes (data points), like the root system of a tree.

Root: The topmost node of a tree, which has no parent (a node above its level to which it is connected).

Leaf node: A node with no children (nodes below its own level which are connected to it)

Binary Tree: A tree where each node has at most two children. A node can have none, one or two children.

So, let's get on with implementing our binary tree code. This is already given in the HackerRank question, so you don't have to type it, but nevertheless it will be a good exercise to try and write one ourselves.

If you are not very comfortable with OOP, I'll try to make it as digestible as possible.

Let's start with the Node class. A class consists of some data (which can be of differing types) and some functions (called methods). Each node has 3 data components:
  • info: the type has to be fixed for certain languages like C++, although there are ways to generalize the data types in C++. Might want to look up Template Specialization. This is not a problem at all in Python, where data types are implicit. For our purposes, we will keep it at int or float values.
  • reference to left node
  • reference to right node
The node class does not need any methods. Let's code this.

class Node:
    def __init__(self, info):
        self.info = info
        self.left = None
        self.right = None

Remember that a constructor is the function which initializes a new object of the class, and we can use it to assign some values to our class data. In Python, the __init__ function designates the constructor.

Every method in a Python class which messes with class data has to have a "self" parameter, which is a reference to the object itself. This is passed automatically when the method is called, and we don't need to pass it manually. This explicit "self" is a Python quirk.

Now lets get to our BinarySearchTree Class. This class will contain only one data point: a ROOT node. This node will be the start of our tree. It will be an object of the Node Class we just made. What's important here is the method we are going to make to populate a tree with values.

This is a binary search tree. This means each node has up to two children (a left and a right child), where the left child stores a value smaller than the parent node, and the right stores a value greater than the parent node. Let's code this:

class BinaryTree:
    def __init__(self):
        self.root = None

    def create(self, val):
        if self.root == None:
            self.root = Node(val)
        else:
            current = self.root

            while True:
                if val < current.info:
                    if current.left:
                        current = current.left
                    else:
                        current.left = Node(val)
                        break
                elif val > current.info:
                    if current.right:
                        current = current.right
                    else:
                        current.right = Node(val)
                        break
                else:
                    break

Ok. Let's explain the create() method. If the root has not yet been given a value (empty tree) then give it a node with the given value (val). If the tree is already populated, compare val with the value of the current node (the first time it will be the root). If the value in the current node is greater than val, we need to go left. If there is no left child, we make one. If the left child already exists, then we set current node to be the left child and repeat the process of comparing val with the current node (which is now the left child). We do the same with the right side case too. This will ensure we go to a node which needs a child to accommodate the new val, but doesn't already have one. Then, we make the node there (which is the appropriate spot due to all the discussion so far) and we are done!

Now after all this, let's get to the problem itself. It asks us to print out the elements in "preorder": the current node first, then its whole left subtree, then its right subtree, i.e. a depth first order preferring the left hand side. Depth first means that we go as far as possible taking lefts; when we can no longer go any further by turning left, we back up and explore the remaining right turns one at a time.
So we are going to go for a recursive approach. The first thing we do is print the info value of the current node. Then we check if the node has a left child. If it does, we recursively call the same function for currentNode.left. Then we check if the right child exists. If it does, we send currentNode.right in another recursive call. We exit the function if a node has no children (i.e. a leaf node).

Here is the code:

def preOrder(root):
    print(str(root.info), end = " ")
    
    if root.left != None:
        preOrder(root.left)
    if root.right != None:
        preOrder(root.right)
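To sanity-check the whole thing end to end, here is a condensed, self-contained version (I've swapped the iterative create() for an equivalent recursive insert, purely to keep this short):

```python
class Node:
    def __init__(self, info):
        self.info = info
        self.left = None
        self.right = None

def insert(root, val):
    # Standard BST insert; duplicates are ignored. Returns the (possibly new) root.
    if root is None:
        return Node(val)
    if val < root.info:
        root.left = insert(root.left, val)
    elif val > root.info:
        root.right = insert(root.right, val)
    return root

def preorder(root, out):
    if root is None:
        return
    out.append(root.info)       # node first...
    preorder(root.left, out)    # ...then the whole left subtree...
    preorder(root.right, out)   # ...then the right subtree

root = None
for v in [1, 2, 5, 3, 4, 6]:
    root = insert(root, v)
result = []
preorder(root, result)
print(result)   # [1, 2, 5, 3, 4, 6]
```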

And that's all for now. Phew. Super long article. We will explore more trees in the next one! Until then...

Sunday, 24 May 2020

[TLE fix] Pairs

This question is just as one would expect it. The first thought you get to solve it will probably be the right one.

For each element arr[i], we search for arr[i] + k in the array, and whenever it is found, we increment a counter variable.

But the twist here is that arr.index() is linear search, an O(n) function. Since we are putting this in a loop, the naive O(n²) solution gives a TLE.

Enter the bisect package. To use it we have to add this line at the top of our code:

from bisect import bisect_left

What does this do? It gives us binary search out of the box.
The bisect module contains 2 particularly useful functions:
bisect_left(array, element) returns the leftmost index at which element could be inserted while keeping array sorted; if element is present, that is the index of its first occurrence.
bisect_right(array, element) returns the rightmost such insertion index; if element is present, that is one past its last occurrence.

Binary search is an O(log n) algorithm. Therefore our solution is O(n log n), and it now passes all the testcases.

Note that the array must be sorted in order for binary search to work. So before calling bisect_left, we must sort the array.
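Putting the pieces together, a sketch of the full solution (one detail matters: since bisect_left returns an insertion point even when the value is absent, we must confirm the element at that index really equals arr[i] + k):

```python
from bisect import bisect_left

def pairs(k, arr):
    arr = sorted(arr)                     # binary search needs sorted input
    count = 0
    for x in arr:
        i = bisect_left(arr, x + k)       # leftmost position where x + k could sit
        if i < len(arr) and arr[i] == x + k:
            count += 1                    # elements are distinct, so one hit each
    return count
```

For arr = [1, 5, 3, 4, 2] and k = 2, the pairs (3, 1), (4, 2) and (5, 3) give a count of 3.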

Friday, 15 May 2020

[deceptively nasty] Palindrome Index

This problem looks really easy, prima facie. But it's a snake waiting to bite!

What I did initially was to initialize two variables, i = 0 and j = len(s) - 1 (start and end of s).
On each iteration of a while loop (while i < floor(len(s) / 2)), compare values s[i] and s[j] and if they are equal, we are good.

But if they are not equal, one of two things can occur: either s[i] is the odd letter out, or s[j]. For this, the natural approach is to check the next letter and see if they match, that is:

1. compare s[i + 1] and s[j]
2. compare s[i] and s[j - 1]

If 1. is true, then s[i] must be removed, and if 2 is true, s[j] must be removed.

Easy, right?
WRONG!

Consider the following testcases:

cwwcw
wcwwc

The above logic will give the wrong output for these cases.

It turns out, conditions 1. and 2. are NOT sufficient for determining which letter is the odd one out. We must also check the next pair of elements as well. The correct conditions are:

1. if s[i] == s[j - 1] and s[i + 1] == s[j - 2]
2. if s[i + 1] == s[j] and s[i + 2] == s[j - 1]

These correctly determine which out of s[i] and s[j] is the odd one out. Deceptively nasty indeed!
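There is also a way to sidestep the pair-counting altogether: at the first mismatch, check whether the whole remaining substring reads as a palindrome after skipping s[i]; if it does, remove i, otherwise remove j. Since the problem guarantees that removing one character works, no deeper case analysis is needed. A sketch (function names are my own):

```python
def is_pal(s, i, j):
    # True if s[i..j] (inclusive) is a palindrome.
    while i < j:
        if s[i] != s[j]:
            return False
        i += 1
        j -= 1
    return True

def palindrome_index(s):
    i, j = 0, len(s) - 1
    while i < j:
        if s[i] != s[j]:
            # By the problem's guarantee, one of these two removals must work.
            if is_pal(s, i + 1, j):
                return i
            return j
        i += 1
        j -= 1
    return -1   # already a palindrome
```

On the nasty cases above it answers 4 for "cwwcw" (remove the last w) and 0 for "wcwwc" (remove the first w).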

Thursday, 14 May 2020

[a little intuition] Max Min

Sometimes, one must trust one's intuition. In coding problems, intuition is a really useful thing to build.

When I opened this problem, I had no idea whatsoever how to approach it.

Suddenly a random thought popped into my head:
What if I sorted the array, and took the first k elements?
Immediately I could think of counterexamples to disprove this notion.

1, 10, 11, 12, 13 and k  = 3

But all was not lost. We could still improve upon this. What if we looked at k length contiguous subsets of this?

1, 10, 11 | 10
10, 11, 12 | 2
11, 12, 13 | 2

And eureka! We could find the minimum difference this way.

Eureka indeed. Why does this work? If you remove all the contest pressure I had, it is plain: if the array is sorted, the differences between consecutive elements are the smallest possible while still being non-negative. For a contiguous window, we are summing the differences between consecutive elements (a bit of a mouthful there).

If the consecutive differences are minimum, then the cumulative difference between the first and last elements (min and max) is minimum, and that solves the problem.
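In code, the whole idea fits in a few lines: sort, then slide a window of size k and compare its endpoints (a sketch, with the function name my own):

```python
def max_min(k, arr):
    """Smallest possible (max - min) over any k elements of arr."""
    arr = sorted(arr)
    # In sorted order the best k elements are always contiguous, so we just
    # slide a window of size k and compare its two endpoints.
    return min(arr[i + k - 1] - arr[i] for i in range(len(arr) - k + 1))
```

For [1, 10, 11, 12, 13] and k = 3, the window 11, 12, 13 gives the answer 2, matching the table above.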

Friday, 8 May 2020

[incremental progress :)] The Full Counting Sort

The first medium level question in my HackerRank journey! And yes, it was pretty hard for me. But the logic was easy to think of. Just hard to implement.

This was a higher level version of Jim and the Orders (whose solution is available here: https://teamresurgence.blogspot.com/2020/05/new-concept-jim-and-orders.html).

Here too, the Hashmap came in handy. Ultimately we have to sort the strings by their keys. But here we have to set it up to make the sorting "stable" (read the question). So here is what I came up with:

1. Accept the string and its key. Search for the key in an array called "takenKeys", which stores all the unique keys seen so far. If the key is found (meaning it is a duplicate), we will modify it to ensure stable sorting. The setters have also put in a loophole of replacing the first half of the strings by '-': while accepting the strings, check if the index of the accepting loop is smaller than floor(n / 2), and simply store '-' instead of the accepted string in that case. A literal loophole!

2. This "modification" is simple. For each unique key I keep a "factor" (the factors for all keys are stored in a separate array), initialized to 0.1. Whenever a duplicate of a key is detected, say 4, I add the factor to it (4 -> 4.1) and then set factor = factor + factor / 10000. Why 10000? I actually tried 2, 10, 100, 1000 and 10000; only 10000 passed all the testcases. Lower values can "overflow": the added factors can exceed 1, and if 4 becomes 5 or more instead of staying between 4 and 5, the sorting messes up.
So if our keys were originally like this:
1, 3, 0, 1, 0, 1
they now become (if factor = factor + factor / 10, for simplicity):
1, 3, 0, 1.1, 0.1, 1.11
After sorting:
0, 0.1, 1, 1.1, 1.11, 3
Now it should be clear why the sorting is stable.

3. Now build the dictionary, with the modified keys as keys and the associated strings as values. Sort the dictionary by keys (unlike Jim, where we sorted by values). And now, print the values!
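The fractional-key trick from step 2 can be sketched like this, using the simplified factor/10 growth from the example above (the function and variable names are mine):

```python
def stabilize(keys, growth=10):
    # Each distinct key gets a running "factor", starting at 0.1.
    # Every repeat of a key has the current factor added to it, and the
    # factor then grows slightly, so later repeats sort after earlier
    # ones. With small growth divisors the factor can eventually exceed
    # 1 ("overflow"), which is why 10000 was needed for the real input.
    factor = {}
    out = []
    for key in keys:
        if key not in factor:
            factor[key] = 0.1
            out.append(key)
        else:
            out.append(key + factor[key])
            factor[key] += factor[key] / growth
    return out
```

For the keys 1, 3, 0, 1, 0, 1 this reproduces 1, 3, 0, 1.1, 0.1, 1.11 (up to floating-point noise).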

Sunday, 3 May 2020

[playing with combinations] Smart Number

A very interesting problem, leading to a result that seems counterintuitive till you prove it. The difficulty is that intuition doesn't encourage one to try and prove it.

Consider any number n.
It has prime factors with varying powers, and those powers may be odd or even.
After squaring, all the powers become even (each exponent doubles). Let such a power be p; there are then p + 1 ways to put that prime
into a factor: prime^0, prime^1, ..., prime^p. Therefore the number of factors of the square is:
(p1 + 1)(p2 + 1)... and so on.
But since every p is even, every p + 1 is guaranteed to be odd.
Since a product of odd numbers is always odd, the number of factors is odd.

Therefore, a perfect square will always have an odd number of factors.
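The claim is easy to verify empirically with a quick divisor counter (a helper of my own):

```python
def divisor_count(n):
    # Count divisors by pairing each d with n // d up to sqrt(n); a
    # perfect square's root pairs with itself and is counted once,
    # which is exactly where the odd count comes from.
    count, d = 0, 1
    while d * d <= n:
        if n % d == 0:
            count += 1 if d * d == n else 2
        d += 1
    return count
```

divisor_count(36) is 9 (odd), while divisor_count(12) is 6 (even).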

[new concept] Jim and the Orders

This problem becomes really easy if we use a hashmap.

The Hashmap:
A data structure very much like an array, but the "index" of this array can be any data type. This "index" is called "key". The data associated with each key is called its "value".
For example:
"Key1" : 1023
"Key2" : "Hello"
12.13 : True

An advantage of a hashmap is that the average time required to extract a value given its key is constant (O(1)).

In python, hashmaps are called "dictionaries".

For this problem, I created a dictionary with the keys being natural numbers, representing the position of each customer in the line (1, 2, 3, ...) and the value being the time for serving the order.

Then I sorted the dictionary by value. The keys should now be in correct order!
One way to do it in python is this:

sorted_pairs = sorted(d.items(), key=lambda kv: (kv[1], kv[0]))  # d is our dictionary

This returns an array of tuples (key, value) of the dictionary sorted by value.

Just extract the keys part in a separate array, and return it.
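Put together, the whole lookup-then-sort flow might look like this (names are mine):

```python
def serve_order(times):
    # times maps each customer's position in line to their serving time.
    # Sorting by (time, position) breaks ties in favour of whoever was
    # earlier in the line.
    return [pos for pos, t in sorted(times.items(), key=lambda kv: (kv[1], kv[0]))]
```

For example, customers 1..4 with times 8, 4, 2, 3 are served in the order 3, 4, 2, 1.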

Friday, 1 May 2020

[a delightful little problem] Fair Rations

This is a complex sounding problem that is actually very simple, and yet requires some creativity to solve.
Let's consider an array of booleans: True if the number of loaves with the person at that index in the line is even, else False. A typical starting array may look like this:

TTTTF

Now this case is interesting. Each move gives one loaf to two neighboring people, flipping both of their parities, so the total number of F's can only change by 0 or 2. Hence, to eliminate F entirely, the total number of F's must be even; otherwise a solution is not possible, in which case we return "NO".

Now let's look at some examples cases where it is possible:

1. TFTFT

2. TTFFT

3. FTTTF

I tried some experimentation, and came up with this algorithm to distribute the loaves:
  1. Loop through every element. If that element is T, move on.
  2. If that element is F, then invert that element and the element to its right. Increment a "breadCount" variable by 2 every time we do this.
  3. Continue doing this. By the end of this, we must end up with all T's.
  4. Return the "breadCount"
Also: Solution accepted on first try :))
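The algorithm above, sketched working directly on the loaf counts (returning strings is my convention; the problem just prints them):

```python
def fair_rations(loaves):
    # An odd total of loaves means an odd number of "F" people,
    # which no sequence of moves can fix.
    if sum(loaves) % 2 != 0:
        return "NO"
    bread_count = 0
    for i in range(len(loaves) - 1):
        if loaves[i] % 2 != 0:
            # Give one loaf each to person i and person i + 1,
            # flipping both parities.
            loaves[i] += 1
            loaves[i + 1] += 1
            bread_count += 2
    return str(bread_count)
```

For loaves 2, 3, 4, 5, 6 this hands out 4 loaves in total.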

[edgy question!] Lisa's Workbook

This problem is a genuinely good one. It requires unbiased and patient consideration of all the possible edge cases.

Solution:
If we write down a few examples, we sense intuitively that we need to keep track of a few things for every chapter.

We need to keep track of:
  • The current page number
  • The number of pages required by each chapter
  • The question numbers on each page
The number of pages required by the i'th chapter is equal to ceil( arr[i] / k ) by observation.

The basic principle: while keeping track of those three things, we loop through all the question numbers of a given chapter, incrementing "page" whenever the question number exceeds the page limit, and moving to the next chapter when its questions run out. The implementation is slightly tricky, so I'll just post my code:

import math

def workbook(n, k, arr):
    special = 0
    page = 1
    for i in range(len(arr)):
        finalpage = page + math.ceil(arr[i] / k)
        j = 1
        offset = 0
        while True:
            if j > k:
                page += 1
                j = j - k
                offset += k
            if page > finalpage:
                break
            if (j + offset) > arr[i]:
                break
            if (j + offset) == page:
                special += 1
            j += 1
        page = finalpage
    return special

Pretty good!
But my friend Tejas Rao made it even easier:
def workbook(n, k, arr):
    c = 0
    pg = 1
    for i in range(n):
        x = arr[i]
        for j in range(1, x + 1):
            if j == pg:
                c += 1
            if j == x:
                continue
            if j % k == 0:
                pg += 1
        pg += 1
    return c

Thursday, 30 April 2020

[issue with the question] Flatland Space Stations

Here is something rare. A question where the question setter does an oopsie in the testcases.

The solution to the intended question:
First of all to make our lives easier, let's sort the array containing the indices of cities containing space stations.
Let us think of this example (0 is a city without space station, 1 is with)

00001001000001000

We should find the number of zeroes between two 1's (let that be n) and then, by induction one can prove that the longest distance to a space station in that run of zeroes is ceil( n / 2 ) (the worst-off city sits in the middle of the run). We will have to find the maximum among all such groups of zeroes between two 1's.

But what about the ends? The ends have to be dealt with differently: there, the number of zeroes from the end to the nearest 1 is itself the largest distance. So we compare the maximum value we obtained before with the number of zeroes between the left end and the first 1, and with the number of zeroes between the last 1 and the right end. The maximum of these three is the answer.

Type, type, type... hit submit! What happens? 4 testcases did not pass. Wrong Answer, it says. But then you spend some Hackos to look at the testcase after racking your brains for an hour, and you see this:
6 6
0 1 2 4 3 5

Wait, what? The question mentions that the cities are indexed 1 based! So the question is not perfect. The setter apparently forgot to make the testcases 1 based. After taking that into account, it works.
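Putting the gap formula and the two ends together, a sketch (names are mine, using the 0-based city indices that the testcases actually use):

```python
def flatland_space_stations(n, c):
    # c holds the (0-based) cities that have a space station.
    c = sorted(c)
    # Ends: every city before the first (or after the last) station is
    # exactly its index distance away from it.
    worst = max(c[0], (n - 1) - c[-1])
    # Interior: with z zeroes between stations a and b, the worst city
    # is ceil(z / 2) = (b - a) // 2 away.
    for a, b in zip(c, c[1:]):
        worst = max(worst, (b - a) // 2)
    return worst
```

For n = 5 with stations at cities 0 and 4, the answer is 2 (city 2 is the worst off).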

Monday, 27 April 2020

[Bitwise OR shenanigans] ACM ICPC Team

This question is really easy, but a little hitch stopped me on this one:

Bitwise OR in common programming languages:
The | operator accepts two integers, ORs their binary representations bit by bit, and returns the resulting integer.
If in a | b we want to write a and b directly in binary, we need to add a "0b" prefix (making them binary literals, like 0b1001). This is what I did not know.

I use Python3 for competitive coding, so for reference I shall add code to OR two numbers given as strings of zeroes and ones:

result = int(a, 2) | int(b, 2)
result_binary = bin(result)[2:]

The first line converts a and b to integers; the second argument "2" tells Python that each is a string of zeroes and ones representing a binary number.
Hence a = "1001" yields 9.

result is set to a | b in the DECIMAL system. (Note that or itself cannot be the variable name, since it is a Python keyword.) The second line converts it back to binary.
The [2:] tells Python to skip the "0b". <== this is what isn't intuitive, and frankly, kind of sucks. But OK. One more thing to keep in mind!
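With that sorted out, the whole problem reduces to ORing every pair of topic strings and counting set bits. A sketch (names are mine):

```python
from itertools import combinations

def acm_icpc_team(topics):
    # topics: equal-length '0'/'1' strings, one per attendee.
    best, count = 0, 0
    for a, b in combinations(topics, 2):
        # OR the two rows as integers, then count the set bits.
        known = bin(int(a, 2) | int(b, 2)).count("1")
        if known > best:
            best, count = known, 1
        elif known == best:
            count += 1
    return best, count
```

For the topic strings "10101", "11100", "11010", "00101" this returns (5, 2): at most 5 topics, achieved by 2 teams.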

Super Reduced String

Here, we go for a recursive solution.

Our function basically takes every element of the string (except the last one) and compares it to the one next to it.

If they are equal, delete both of them from the string

We are not done yet. We have only made one pass. This may generate a new string which may be further reducible. So we introduce a boolean variable "done" initialized as True right at the beginning of the function, before any other statement.

Whenever we find two adjacent elements being equal, we set the value of done = False. At the end of the function, we check if done is False. If it is false, we set

string = superReducedString(string) 

(recursion)

When the super reduced string is obtained, "done" will remain true. When done is true, return the string.
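The pass-and-recurse scheme described above, sketched out (the "Empty String" fallback follows the problem statement):

```python
def super_reduced_string(s):
    done = True
    i = 0
    while i < len(s) - 1:
        if s[i] == s[i + 1]:
            # Delete the adjacent pair; the string may now be further
            # reducible, so remember to make another pass.
            s = s[:i] + s[i + 2:]
            done = False
        else:
            i += 1
    if not done:
        s = super_reduced_string(s)
    return s if s else "Empty String"
```

For example, "aaabccddd" reduces to "abd", and "aa" reduces all the way to "Empty String".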

Strange Counter

After some observation by writing down a few cases, it becomes clear that the counter runs in cycles: the cycle numbered n (n = 0, 1, 2, ...) begins at time 1 + 3(2^n - 1). So for a query time t we want the largest n with 1 + 3(2^n - 1) <= t, which is

n = floor(log2( (t - 1)/3 + 1 ))

1 + 3(2^n - 1) is the cumulative time till the latest counter.

Then c = 3 * 2^n gives us the value from which the latest counter starts counting down (check this for a few cases to make it clear).

Find the difference between the current time t and 1 + 3(2^n - 1) (to know how many steps it has counted down).
Subtract that difference from c and that is the answer.
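The steps above compress into a few lines (a sketch; math.log2 is fine for this problem's constraints, though for astronomically large t the float could lose precision):

```python
import math

def strange_counter(t):
    # Cycle n (n = 0, 1, 2, ...) starts at time 1 + 3(2^n - 1) with
    # value 3 * 2^n and counts down by 1 each second.
    n = int(math.log2((t - 1) / 3 + 1))
    start = 1 + 3 * (2 ** n - 1)
    return 3 * 2 ** n - (t - start)
```

So at t = 4 the counter resets to 6, and at t = 8 it reads 2.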

Sherlock and The Beast

The number has n digits, each 3 or 5: say a 5's and b 3's, so a + b = n. For a "Decent Number", a must be a multiple of 3 and b a multiple of 5, and for the number to be largest possible, a must be as large as possible (the 5's go in front, followed by the 3's). So start with a being the largest multiple of 3 not exceeding n, and b = n - a. If b % 5 == 0, then we are good! But if b % 5 != 0, then we decrease a by 3 (and increase b by 3) until either b becomes a valid multiple of 5, giving the correct distribution, or a drops below 0 without obtaining any decent number, in which case we return -1.
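A sketch of that search (the function name is mine):

```python
def decent_number(n):
    # Maximize the count of 5's: start from the largest multiple of 3
    # not exceeding n and step down by 3 until the remaining count of
    # 3's is a multiple of 5.
    fives = n - n % 3
    while fives >= 0:
        threes = n - fives
        if threes % 5 == 0:
            return "5" * fives + "3" * threes
        fives -= 3
    return "-1"
```

For n = 11 this yields "55555533333", and for n = 1 there is no decent number, so it returns "-1".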

Halloween Sale

Algorithm:
1. Find n = floor((p - m)/d), the number of price drops needed to reach the minimum m.
2. Check if s - np + n(n - 1)d/2 > 0.
    This formula comes from observing how much money we have after each purchase:
    s, s - p, s - 2p + d, s - 3p + 3d, ...... , s - np + n(n - 1)d/2
    If this is > 0, that means even after the minimum price is reached, more items can be bought.
    The edge cases here are the base cases (n = 1 and n = 0):
    the n = 0 edge case is when s < p (no items can be bought);
    the n = 1 edge case is when s - p < p - d (one item can be bought).
    These edge cases have to be dealt with separately, using if statements prior to step 2.
3. If the answer to step 2 is yes, then
    s = s - np + n(n - 1)d/2
    n += floor(s/m)
4. If the answer to step 2 is no, then we solve the quadratic equation
    s - np + n(n - 1)d/2 = 0
    for n to find the number of items we can buy, and then floor the result.
    I don't know why, but taking the smaller root worked, while the bigger root did not always.
5. return n
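Since the constraints here are small, the closed forms above can be cross-checked against a direct simulation (a brute-force alternative, not the O(1) approach above; the name is mine):

```python
def halloween_sale(p, d, m, s):
    # Buy at price p; each purchase drops the price by d, but never
    # below the minimum m. Stop when the next item is unaffordable.
    count, price = 0, p
    while s >= price:
        s -= price
        count += 1
        price = max(m, price - d)
    return count
```

With p = 20, d = 3, m = 6 and s = 80, for instance, 6 items can be bought.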