Feedback hw03

First, I want to say that I was generally impressed by the quality of the reports, the code, and the work you did for hw03. I don’t think you were given an easy task, and you all (well, almost all) delivered code that ran out-of-the-box and generated the datasets as requested, often with very good quality.

Most of you managed GitHub well (except two teams that made problematic commits at the end; no marks were lost for those, as I mentioned), and almost all submissions were on time with the correct structure.

I learned many things reading some reports, which is a testament to the quality of what you did 👍

The marks reflect this: the average is 75% and the median 80%. One team provided results that I didn’t know could be achieved with only geometric methods, and their report was better written than many scientific articles out there, so they got 100% (bravo to them 👏).


To mark the assignments, I used the AHN5 file I had given you and a tile of AHN6 (this one: https://surfdrive.surf.nl/s/cb7fjoqrirsjpyM).

Then, for each team, I used their default parameters, except for the CSF, where I set the grid resolution to 2.0m and the epsilon to 1.0 for the classification.
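
For reference, a minimal sketch of how those two parameters map onto the CSF Python bindings (pip install cloth-simulation-filter); I’m assuming here that “epsilon” corresponds to the library’s class_threshold, and the input file name is hypothetical:

```python
import CSF
import laspy
import numpy as np

las = laspy.read("ahn_tile.laz")  # hypothetical input file
xyz = np.vstack((las.x, las.y, las.z)).T

csf = CSF.CSF()
csf.params.cloth_resolution = 2.0  # grid resolution used for marking
csf.params.class_threshold = 1.0   # the "epsilon" distance for ground classification
csf.setPointCloud(xyz)

ground = CSF.VecInt()      # receives indices of ground points
non_ground = CSF.VecInt()  # receives indices of non-ground points
csf.do_filtering(ground, non_ground)
```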

I then checked whether your DTM was close to mine by performing a map-algebra subtraction and inspecting the values.
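
A minimal sketch of that check, assuming both DTMs are GeoTIFFs on the same grid (extent, resolution, CRS) with nodata stored as NaN; the file names are hypothetical:

```python
import numpy as np
import rasterio

with rasterio.open("dtm_reference.tif") as ref, rasterio.open("dtm_submitted.tif") as sub:
    diff = ref.read(1) - sub.read(1)  # map-algebra subtraction, cell by cell

print("mean |diff|:", np.nanmean(np.abs(diff)))
print("max  |diff|:", np.nanmax(np.abs(diff)))
```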

I then visually reviewed the results to see whether, and how well, the classification performed.

Marking was complex because you optimised/calibrated your values for specific datasets, and those parameters often work best together (so if I changed, say, the cloth resolution, it might break your whole workflow…). Since you all started with the ground first (I think, though it wasn’t always clear!), if that step is not performed correctly then the other classes are affected too. I was aware of this, so marking was based on general observations over 2 randomly chosen datasets and a careful study of the results.

For each group, I have added an issue to your GitHub repository, where I provided a summary of my comments and a link to the PDF with annotations.

If you have comments, it’s best to ask me during the exam session on Tuesday 2025-02-10 @ 11:00—12:30 in room BG.West.670 (where you can also see your exam copy).

Some comments, in no particular order:

  1. Your teams were a real mix of nationalities/genders/backgrounds, and if I had had to form teams to maximise diversity, I couldn’t have done better. It’s nice to see!
  2. A picture is worth a thousand words, they say. Well, many of you often wrote extensively where a simple figure would have helped me, the poor reader, who struggled to understand what you meant.
  3. Figures showing an overview of your trees from 300m away, printed in a PDF, mean nothing. You need to zoom in to discover that 20% of the points inside the trees are class=1, or that everything under the tree is buildings (class=6).
  4. As is the case for most processing of geographical datasets: buffer the dataset to be processed and crop at the end, to avoid edge artifacts. This means you should have selected 10m/20m more than the 500m×500m to make sure you avoid artifacts. When you’re done processing, you then crop, not before (see the crop sketch after this list).
  5. In a scientific report, please do not describe in chronological order everything you tried. It is confusing for the reader. Explain what works and what you implemented, and then you can add a section explaining that you tried RANSAC but finally opted for region-growing. If you state at first that you use RANSAC and then reveal on p.8 that it wasn’t working so you dropped it, it’s really confusing.
  6. CSF and unmovable points:

    p.118 of the terrainbook:

    For each particle 𝑝, we need to define the lowest elevation to which it can move; once it reaches it, it is labelled as unmovable (line 5). The lowest elevation of one particle 𝑝 is defined as the original elevation of the closest sample point S after projecting both to the 2D plane.

    A few teams seem to have skipped this part (or it wasn’t clear to them), and they got weird results in the end, partly because of this. All points needed to be projected to the 2D plane, and the closest point in 2D space (not in 3D!) was used for the lowest elevation; see the nearest-neighbour sketch after this list.

  7. Some teams classified buildings before trees, others the other way around; all classified the ground first, though. What is the correct order? I would intuitively process buildings before trees, and the team that got 100% did exactly this and obtained excellent results.
  8. You all used eigenvalues, and some, but not all, exploited return_number. Using return_number was a double-edged sword: you find more trees, but you omit a lot of the leaves on the periphery of the tree…
  9. Calculating characteristics/features (based on eigenvalues) at different scales is probably the way to go to capture the context around a tree or a building (a single-scale sketch follows this list). And since carefully crafting a decision tree for the values to use is tedious, using machine learning with a random forest is probably the better option (you’ll learn this in GEO5017). But ultimately, deep learning methods are the ones people will use in the future; at the moment, as I said during a lecture, it’s not clear what the companies classifying the AHN are doing (but one thing is sure: they use humans to verify/fix the errors made by the computer).
  10. The overall algorithm should be presented clearly. Often you had only 3 sections, but were those performed in the order you presented them? As a reader, I should be told.
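
On point 4, a minimal sketch of crop-at-the-end with laspy; the bounds and file names are hypothetical:

```python
import laspy

# hypothetical bounds of the original 500m x 500m tile
XMIN, YMIN, XMAX, YMAX = 85000.0, 446000.0, 85500.0, 446500.0

las = laspy.read("tile_buffered_classified.laz")  # processed with a ~20m buffer
inside = ((las.x >= XMIN) & (las.x < XMAX) &
          (las.y >= YMIN) & (las.y < YMAX))
las.points = las.points[inside]  # crop only now, after all processing is done
las.write("tile_final.laz")
```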
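On point 6, a minimal sketch of the lowest-elevation lookup; the key is that the KD-tree is built on x,y only, so the query is a 2D nearest neighbour (the data here are random stand-ins):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
samples = rng.random((1000, 3)) * [500.0, 500.0, 30.0]  # stand-in (n,3) sample points
particles_xy = rng.random((200, 2)) * 500.0             # stand-in cloth particles, in 2D

tree = cKDTree(samples[:, :2])          # project samples to the 2D plane: x,y only
_, idx = tree.query(particles_xy, k=1)  # closest sample point in 2D space, not 3D
lowest_z = samples[idx, 2]              # its original elevation = the particle's lowest elevation
```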
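And on points 8–9, a minimal sketch of per-point eigenvalue features at a single scale; running it with several neighbourhood sizes k (or radii) gives the multi-scale features I mentioned. The function name and k are illustrative:

```python
import numpy as np
from scipy.spatial import cKDTree

def eigen_features(points, k=20):
    """Linearity, planarity, sphericity from the k-NN covariance eigenvalues."""
    tree = cKDTree(points)
    _, idx = tree.query(points, k=k)
    feats = np.empty((len(points), 3))
    for i, nb in enumerate(idx):
        cov = np.cov(points[nb].T)  # 3x3 covariance of the neighbourhood
        l1, l2, l3 = np.sort(np.linalg.eigvalsh(cov))[::-1]  # l1 >= l2 >= l3
        l1 = max(l1, 1e-12)         # guard against degenerate neighbourhoods
        feats[i] = ((l1 - l2) / l1,  # linearity
                    (l2 - l3) / l1,  # planarity
                    l3 / l1)         # sphericity
    return feats
```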