Velocity Reviews > C++ > PLS HELP w/ Interview Question!!

# PLS HELP w/ Interview Question!!

Nobody
Guest
Posts: n/a

 11-18-2006
I've been looking for a job for a while now, and have run into this
interview question twice now... and have stupidly kind of blown it twice...
(although I've gotten better)... time to finally figure this thing out...

basically the interview question is: given an unsorted listed such as:

3,1,3,7,1,2,4,4,3

find the FIRST UNIQUE number, in this case 7... and of course the list can
be millions long...

the first time I got this question, my solution was something like:

1) declare a second array A2 with n elements
2) loop through original array A1 and insert the element in its sorted
position in A2, if I find a match eliminate, mark all copies of that element
as "duplicate"
3) loop through original array A1 again and find the first number that is
not listed as a duplicate in A2

I think at the time of that interview, I pointed out that this wasnt a very
good algorithm as it would be something like O(2*(n^2)), but it will get the
job done (I did get offered that job actually, but stupidly turned it
down)...

I then offered up another solution at that interview which I thought was
better:

1) build a copy of A1 that has the original position of each element
2) sort A1
3) loop through sorted A1 deleting any duplicate items
4) now A1 will only contain unique items... so loop through it again looking
for the smallest position that was marked in step 1

this was kind of the algorithm I started out with in interview #2 (which I
really wanted, but didnt get)...as the interviewer asked me to give him the
run time of that algorithm which I said was:

step #1: n
step #2: n log n
step #3: n
step #4: n

resulting in 3n + n log n...

this was a lot better then the original algorithm, but still not too great
because of the 3n term...

I got my rejection from the 2nd place yesterday, so I kind of was up all
night trying to figure out how to get it faster...

I came up with something like this:

1) loop through A1 building a customized AVL tree... an AVL tree insertion
is O(log n), so building this tree should take O(n log n).. the customized
part is... each node inserted will bring along its original position in the
array (0..n)... if I run across a duplicate node, I will mark the position
of the node thats already in the tree as "-1" and dump the node I'm trying
to insert

so basically I've spent O(n log n) to combine steps 1, 2 & 3 of my last
solution... basically taking 2n + n log n down to O(n log n)...

2) but now I need to visit every node in the tree to find out which one has
the lowest position which is O(n) obviously ignoring any nodes I marked
as -1.

so this final "optimized" solution I came up with last night would still
require O(n + n log n).

Is there any way to optimize this further? or a better algorithm all
together? I'm thinking perhaps to keep track of the first unique node as I'm
creating the tree? but I couldn't nail down an idea that would work...

I'm sort of invisioning the tree having a member called m_nPos initialized
to 0... and as I'm building the tree, if I find a duplicate element at
m_nPos, to increment it. So basically I insert element 0, then later on, I
find a duplicate for it, so the first unique item can't possibly be at zero,
it has to be at least at position 1. But then as I did a few test cases, I
poke holes in this idea very easily.

Any ideas?

hardi.pertel@gmail.com
Guest
Posts: n/a

 11-18-2006

> basically the interview question is: given an unsorted listed such as:
>
> 3,1,3,7,1,2,4,4,3

Where there any other constraints?

Nobody
Guest
Posts: n/a

 11-18-2006

<(E-Mail Removed)> wrote in message
news:(E-Mail Removed) oups.com...
>
>
>> basically the interview question is: given an unsorted listed such as:
>>
>> 3,1,3,7,1,2,4,4,3

>
> Where there any other constraints?
>

Nope, just given an unsorted list of n numbers (n could be millions), find
the first unique number. Obviously you can't use a built in
"FindFirstUnique()" number function, the exercise was to write that
function. I thought there may be some tricks with using a database, but
thought the overhead of the database would kill any performance gains in the
algorithm.

Obviously the point is to find the first unique number in the shortest time
possible... the algorithm I described in the original post was O (n + n log
n)... I'm wondering if it can be done faster.

hardi.pertel@gmail.com
Guest
Posts: n/a

 11-18-2006
No idea for solving that. The AVL tree is unclear for me, but a
brute-force solution is simple, though time-consuming

Wayne Marsh
Guest
Posts: n/a

 11-18-2006
Nobody wrote:
> I've been looking for a job for a while now, and have run into this
> interview question twice now... and have stupidly kind of blown it twice...
> (although I've gotten better)... time to finally figure this thing out...
>
> basically the interview question is: given an unsorted listed such as:
>
> 3,1,3,7,1,2,4,4,3

Here's the first method that springs to mind for me:

Create a vector of bools initialised to false the same size as the list,
and then run through the list, setting the item in the vector at the
position of the number encountered to true, and removing ones that are
already true. Then simply look at the first element of the vector.
Bingo. One pass, right?

I bet I've made some glaring error in process or efficiency.

Wayne Marsh
Guest
Posts: n/a

 11-18-2006
Wayne Marsh wrote:
> Nobody wrote:
>> I've been looking for a job for a while now, and have run into this
>> interview question twice now... and have stupidly kind of blown it
>> twice... (although I've gotten better)... time to finally figure this
>> thing out...
>>
>> basically the interview question is: given an unsorted listed such as:
>>
>> 3,1,3,7,1,2,4,4,3

>
> Here's the first method that springs to mind for me:
>
> Create a vector of bools initialised to false the same size as the list,
> and then run through the list, setting the item in the vector at the
> position of the number encountered to true, and removing ones that are
> already true. Then simply look at the first element of the vector.
> Bingo. One pass, right?
>
> I bet I've made some glaring error in process or efficiency.

This doesn't even make sense at all.

Sorry.

Frederick Gotham
Guest
Posts: n/a

 11-18-2006
Nobody:

> given an unsorted listed such as:
>
> 3,1,3,7,1,2,4,4,3
>
> find the FIRST UNIQUE number, in this case 7... and of course the list
> can be millions long...

I've only recently begun reading a book on the C++ Standard Library, so I'm
not very familiar with all the built-in iterator routines and so forth, so
I can't provide an optimal solution.

However, I could give you a inefficient solution that gets the job done by
simply putting thought into it:

(This code won't be Grade A because I'm only starting to get the hang of
iterators and so forth. Expect mistakes.)

#include <cstddef>

template<class Iterator,class T>
std::size_t QuantOccur(Iterator begin,Iterator const end,T const &obj)
{
size_t quant = 0;

while (end!=begin) if (*begin++==obj) ++quant;

return quant;
}

template<class Iterator>
Iterator FindFirstUnique(Iterator begin,Iterator const end)
{
for (;end!=begin;++begin) if (1==QuantOccur(begin,end,*begin)) break;

return begin;
}

#include <iostream>
#include <ostream>

int main()
{
int arr[] = {3,1,3,7,1,2,4,4,3};

int i = *FindFirstUnique(arr,arr+sizeof arr/sizeof*arr);

std::cout << i << std::endl;
}

For all I know, there could be a Standard Library facility that does this
with a single line of code!

--

Frederick Gotham

Frederick Gotham
Guest
Posts: n/a

 11-18-2006
Frederick Gotham:

> size_t quant = 0;

Once again my retard compiler failed to spot the error.

Nobody
Guest
Posts: n/a

 11-18-2006

<(E-Mail Removed)> wrote in message
news:(E-Mail Removed) oups.com...
> No idea for solving that. The AVL tree is unclear for me, but a
> brute-force solution is simple, though time-consuming
>

Hehe... that was kind of the point of my question. In the interview, I got
it down to O(3n + n log n), later on at home I got it down to O(n + n log
n). I'm trying to see if there is a faster way.

Nobody
Guest
Posts: n/a

 11-18-2006

"Wayne Marsh" <(E-Mail Removed)> wrote in message
news:455f6b31\$0\$2440\$(E-Mail Removed)...
> Wayne Marsh wrote:
>> Nobody wrote:
>>> I've been looking for a job for a while now, and have run into this
>>> interview question twice now... and have stupidly kind of blown it
>>> twice... (although I've gotten better)... time to finally figure this
>>> thing out...
>>>
>>> basically the interview question is: given an unsorted listed such as:
>>>
>>> 3,1,3,7,1,2,4,4,3

>>
>> Here's the first method that springs to mind for me:
>>
>> Create a vector of bools initialised to false the same size as the list,
>> and then run through the list, setting the item in the vector at the
>> position of the number encountered to true, and removing ones that are
>> already true. Then simply look at the first element of the vector. Bingo.
>> One pass, right?
>>
>> I bet I've made some glaring error in process or efficiency.

>
> This doesn't even make sense at all.
>
> Sorry.

Probably not... you'd have to be able to map the bools to the items and the
position which would start getting to the order of my original first pass
2*(n^2) algorithm.

Same thing with a hash table, you'd need the original order somehow and you
can't really remove the first and second item, because then when a 3rd item
comes along, you wont know its a duplicate.