New exercise about bitwise operations

A few days ago Erik encouraged us to add some new exercises.
Here’s an idea for an exercise about bitwise operations:

Integers can represent “bit sets” where each set/unset bit represents the presence/absence of a specific element.
Two integers (bit sets) bs1 and bs2 are “disjoint” if they do not have a common set bit, i.e. if bs1 AND bs2 is 0.
A list of integers is “disjoint” if all integers in the list are pairwise disjoint.

For example:

  • [1, 36, 18] is a disjoint list because the first bit (0b00_0001) is only set in the first element, the third and sixth bit (0b00_0100 and 0b10_0000) are only set in the second element, and the second and fifth bit (0b00_0010 and 0b01_0000) are only set in the third element.
  • [10, 12] on the other hand is not a disjoint list because the fourth bit (0b1000) is set in both elements.

Given a list of non-negative integers, find the longest continuous disjoint sublist.

For example:

  • In the list [222, 189, 148, 32, 144, 2, 0, 76, 26, 62] the longest continuous disjoint sublist is [32, 144, 2, 0, 76]
  • In the list [194, 198, 49, 1, 0, 106, 221, 194] the longest continuous disjoint sublist is [1, 0, 106]

This exercise can be solved in multiple ways:

  • with brute force, by testing all sublists.
  • greedily, by looping over the list and for each potential start of the sublist adding elements until a duplicate bit is found.
  • with a binary search, by “guessing” the length, trying to find a disjoint sublists of that length, and adjusting the bounds of the binary search accordingly.
  • with a sliding window, by looping over the list, narrowing the window on the left until the next element at the right can be included.

This fourth approach is the reason for this proposal, because it uses a combination of three bitwise operations: AND (to check if the window can be widened), OR (to actually widen the window and include an element), and XOR (to narrow the window and exclude an element).

This exercise is more on the algorithmic side but I think that’s just the consequence of aiming for a specific technique (bitwise operations). I’d say its difficulty is medium to medium-hard.


Open question: Do we want the exact sublist? And if multiple longest disjoint sublists exist, which one? Or do we just want the length of the longest disjoint sublist?

My biggest problem is coming up with a story that does not sound too artificial.
It needs a sequence of entities (people, objects, events, …), each with a subset of binary attributes.
It should make sense that we want to avoid duplicates and that we want a continuous sublist.

Here’s a braindump of ideas:

  • The daily meal in a cantina is made with a subset of ingredients (e.g. brokkoli and cheese on Monday, chicken and corn on Tuesday, pasta and chesse on Wednesday). Find the longest continuous span of days without a duplicate ingredient.
  • A gym offers daily course that train specific muscles or abilities (e.g. legs and endurance on Wednesday, flexibility on Friday, biceps and strength on Saturday). Find the longest continuous span of days with courses that do not train a muscle or ability twice.
  • You are part of a team of specialists with unique abilities. There’s a sequence of events with different requirements (e.g. Ben and Karen in week 1, Alexandra and Andrea in week 2, Eric and Aman in Week 3, Alexandra and Eric in Week 4). Find the longest interval that does not require one team member twice.
  • You’re teaching a class of high school students who are sick or absent from time to time. You store these absences in bitsets, one bitset per day, where each bit represents whether a specific student has been absent. You want to know the longest period in days in which no student has missed class twice or more.

Edit: Fixed “non-positive”. Thank, Isaac
Edit 2: added another possible story

2 Likes

I like the disjoint part. The sublist part seems like it might have some overlap with the existing sublist exercise; it that an issue?

What does “non-positive” mean there?

The sublist part seems like it might have some overlap with the existing sublist exercise

In my experience the most common terms in similar contexts are “subsequence” (for an in-order selection of elements that are not necessarily contiguous) and “subarray” (for an in-order selection of contiguous elements).
Exercism does use the term “sublist” in one exercise (which is equivalent to “subarray”), so I chose to use that.

Otherwise this exercise doesn’t have much in common with the sublist exercise.

What does “non-positive” mean there?

Not negative, i.e. 0 or positive.

“non-positive” includes positive? Do you mean non-negative?

1 Like

Oh, you’re right. I will fix that in my post. Thank you!

1 Like

I’ll cross-post this from the exercise @tasx proposed…

When I’m evaluating exercises for myself, my main starting point is to write the code and then judge how easy/hard/elegant/interesting/etc the solutions can get. I then tend to backfill everything else based on that. Without the code, I feel like it’s really hard for me to evaluate how “good” at exercises is on the educational metrics.

So I’d love to see a solution (ideally in a high-level language - e.g. JS, Python, Ruby, C#, your QBasic one, etc) in order to get a feel for what this would look like. Ideally whatever solution you feel really shows off why it’s interesting etc.

2 Likes

I’ve written a Python version of the exercise, stripped down to the core, without any story.
First a stub with some tests for anybody who wants to try to solve it by themselves:

#! /usr/bin/env python3

import unittest


def longest_disjoint_sublist(bitsets: list[int]) -> int:
    """Determine the length of the longest disjoint sublist."""
    # TODO: your solution here
    # Happy coding!


class LongestDisjointSublistTest(unittest.TestCase):
    """Test the function `longest_disjoint_sublist`."""

    def test_empty_list(self) -> None:
        """The longest sublist of an empty list is empty."""
        self.assertEqual(longest_disjoint_sublist([]), 0)

    def test_single_element(self) -> None:
        """Any list of length 1 is disjoint."""
        self.assertEqual(longest_disjoint_sublist([23]), 1)

    def test_two_equal_non_empty_bitsets(self) -> None:
        """Two equal non-empty bitsets are not disjoint."""
        self.assertEqual(longest_disjoint_sublist([10, 10]), 1)

    def test_two_bitsets_with_one_common_bit(self) -> None:
        """Two bitsets with a common set bit are not disjoint."""
        self.assertEqual(longest_disjoint_sublist([39, 28]), 1)

    def test_two_bitsets_without_common_bit(self) -> None:
        """Two bitsets without a common set bit are disjoint."""
        self.assertEqual(longest_disjoint_sublist([21, 42]), 2)

    def test_all_bitsets_disjoint(self) -> None:
        """If all bitsets have unique set bits the whole list is disjoint."""
        self.assertEqual(longest_disjoint_sublist([1, 32, 2, 16, 4, 8]), 6)

    def test_longer_list_with_short_disjoint_sublists(self) -> None:
        """A list with multiple disjoint sublist of length 2."""
        bitsets = [
                854, 798, 768, 32, 733, 416, 342, 447, 302, 382, 968, 739, 51,
                934, 852, 0, 100, 1020, 0, 76, 373, 83, 0, 1023, 1023, 256,
                500, 511, 192, 582]
        # all distjoint sublists of length 2:
        # [768, 32]
        # [32, 733]
        # [852, 0]
        # [0, 100]
        # [1020, 0]
        # [0, 76]
        # [83, 0]
        # [0, 1023]
        self.assertEqual(longest_disjoint_sublist(bitsets), 2)

    def test_longer_list_with_shorter_disjoint_sublist(self) -> None:
        """A list with one disjoint sublist of length 4."""
        bitsets = [
                8, 272, 329, 8, 86, 828, 80, 16, 202, 10, 8, 40, 428, 8, 32,
                1, 317, 1, 16, 0, 973, 1, 4, 2, 769, 120, 64, 321, 136, 881]
        # longest disjoint sublist: [4, 2, 769, 120]
        self.assertEqual(longest_disjoint_sublist(bitsets), 4)

    def test_longer_list_with_long_disjoint_sublist(self) -> None:
        """A list with one disjoint sublist of length 8."""
        bitsets = [
                256, 256, 9, 72, 16, 80, 4, 128, 64, 16, 128, 1, 128, 32,
                0, 0, 64, 16, 6, 64, 16, 1, 128, 64, 132, 512, 1, 4, 16, 2]
        # longest disjoint sublist: [1, 128, 32, 0, 0, 64, 16, 6]
        self.assertEqual(longest_disjoint_sublist(bitsets), 8)

    def test_long_list_with_long_disjoint_sublist(self) -> None:
        """A list with one disjoint sublist of length 7."""
        bitsets = [
                1, 538, 63, 510, 1023, 0, 0, 64, 718, 900, 989, 635, 128, 828,
                4, 0, 0, 167, 256, 570, 895, 0, 417, 1022, 788, 128, 0, 0, 33,
                578, 0, 8, 1023, 144, 33, 386, 608, 1015, 260, 2, 0, 959, 0,
                144, 0, 288, 767, 292, 64, 1023, 0, 1022, 991, 1018, 310, 368,
                733, 959, 356, 764, 269, 507, 1021, 834, 394, 0, 503, 940, 991,
                1022, 5, 352, 1007, 336, 666, 667, 512, 97, 19, 828, 767, 641,
                517, 592, 640, 364, 1023, 4, 285, 229, 281, 511, 1006, 767, 0,
                503, 678, 141, 195, 128, 14, 283, 951, 12, 0, 447, 1021, 575,
                0, 0, 0, 962, 8, 763, 0, 385, 0, 1023, 540, 768, 979, 242, 773,
                782, 621, 4, 1019, 751, 409, 430, 895, 274, 146, 105, 414, 596,
                8, 0, 495, 336, 421, 727, 1023, 1, 81, 987, 971, 991, 215,
                1019, 256, 256, 745, 32, 550, 849, 1023, 332, 0, 950, 521, 119,
                0, 877, 1023, 4, 1023, 0, 313, 957, 605, 0, 987, 66, 36, 367,
                32, 1023, 693, 32, 791, 991, 885, 445, 0, 917, 500, 1007, 703,
                8, 1023, 580, 1, 24, 731, 514, 468, 284, 918, 72, 271, 102,
                762, 817, 635, 425, 518, 988, 331, 235, 0, 383, 859, 1023, 858,
                2, 36, 64, 1023, 0, 0, 509, 289, 503, 969, 1015, 0, 609, 1022,
                991, 1, 581, 478, 312, 100, 0, 959, 128, 502, 144, 192, 3, 767,
                117, 1, 255, 324, 206, 629, 735, 511, 1023, 6, 201, 580, 774,
                832, 579, 657, 392, 16, 821, 726, 949, 4, 667, 660, 386, 999,
                959, 26, 711, 186, 518, 0, 147, 1007, 1023, 225, 133, 937, 437,
                305, 35, 767, 937, 973, 511, 299, 10, 1021, 515, 766, 132, 782,
                72, 957, 383, 0, 516]
        # longest disjoint sublist: [128, 0, 0, 33, 578, 0, 8]
        self.assertEqual(longest_disjoint_sublist(bitsets), 7)


if __name__ == '__main__':
    unittest.main()

Here’s my implementation of the solution:

def longest_disjoint_sublist(bitsets: list[int]) -> int:
    """Determine the length of the longest disjoint sublist."""
    max_len = or_sum = first_idx = 0
    for last_idx, bs in enumerate(bitsets):
        while or_sum & bs:
            or_sum ^= bitsets[first_idx]
            first_idx += 1
        or_sum |= bs
        max_len = max(max_len, last_idx - first_idx + 1)
    return max_len