Venice2020 Building Heights Detection: Difference between revisions

= Introduction =
In this project, our main goal is to obtain the height information of buildings in the city of Venice. To achieve this, we construct a point cloud model of Venice from Google Earth imagery and YouTube drone videos with the help of photogrammetry tools. Initially, we experimented with drone videos from YouTube. Since our goal is to detect the height of all the buildings in Venice, it is important for us to collect images of every building in the city. However, in YouTube videos we only found landmark architecture whose height information is already available on the Internet. Hence, we switched the data source to Google Earth images. With our current image source, we have successfully calculated the heights of Venice buildings, both little-known and famous ones. We also evaluated the subjective and objective quality of our City Elevation Map, which will be discussed in the following sections. The Venice 3D model, along with the height information, can be used to create a virtual visiting experience. Besides, the City Elevation Map can be useful for urban planning and security purposes in the future.


= Motivation =
 
Venice, one of the most extraordinary cities in the world, was built on 118 islands in the middle of the Venetian Lagoon at the head of the Adriatic Sea in Northern Italy. The planning of the city, floating in a lagoon of water, reeds, and marshland, has always amazed travellers and architects. The long and rich history of Venice began on March 25th, 421 AD, at high noon. Initially, the city was built from mud and wood, and its plan has evolved continually ever since. Today, "the floating city of Italy" is called "the sinking city": over the past 100 years, the city has sunk 9 inches. According to environmentalists, global warming will raise the sea level, which could eventually submerge this beautiful city in the coming years. To us as engineers, it is important to keep track of the building height changes of Venice in order to take relevant actions in time to save the beautiful city. Since natural changes are sometimes unpredictable, to preserve the present beauty of Venice we can recreate the Venice visiting experience virtually as a part of the [https://en.wikipedia.org/wiki/Venice_Time_Machine Venice Time Machine]. In the future this will be relevant for travellers, historians, architects, and other enthusiasts around the world. Apart from the facts mentioned above, building height is one of the most important pieces of information to consider for urban planning, economic analysis and digital twin implementation.
 
= Milestones =


=== Milestone 1 ===
 
* Get familiar with [https://github.com/openMVG/openMVG OpenMVG], [http://www.open3d.org/ Open3D], [https://pyntcloud.readthedocs.io/en/latest/ pyntcloud], [https://www.agisoft.com/ Agisoft Metashape], [https://www.blender.org/ Blender], [https://www.cloudcompare.org/main.html CloudCompare] and [https://www.qgis.org/en/site/ QGIS]


* Collect high-resolution Venice drone videos from YouTube and Google Earth images as supplementary materials


=== Milestone 2 ===


* Align photos of Venice, generate sparse point clouds made up of only high-quality tie points and repeatedly optimize point cloud models by reconstruction uncertainty filtering, projection accuracy filtering and reprojection error filtering
* Remove outliers automatically and manually, and build dense point clouds based on sparse cloud
* Build Venice 3D model (mesh) and tiled model according to dense point cloud data


=== Milestone 3 ===


* Generate ground plane with the largest support in the point cloud and redress the plane by translation and rotation
* Construct City Elevation Model by a point-to-plane method and generate City Elevation Model based on z-coordinate after redressing
 
=== Milestone 4 ===
 
* Align the City Elevation Map to the reference cadaster using QGIS's georeferencing and evaluate the subjective visual quality of the map
* Evaluate the accuracy of the height calculation for both the Google Earth-based and the YouTube-based Venice models


= Methodology =
== Images Acquisition ==


[[File:a.png|x250px|thumb|upright=1.5|right|Figure 1: YouTube Video]]


Point cloud denotes a 3D content representation which is commonly preferred due to its high efficiency and relatively low complexity in the acquisition, storage, and rendering of 3D models. In order to generate Venice dense point cloud models, sequential images from different angles in different locations need to be collected first based on the following two sources.


[[File:b.png|x250px|thumb|upright=1.5|right|Figure 2: Google Earth]]


=== Youtube Video Method ===
=== Google Earth Method ===


In order to access the comprehensive area of Venice, we decided to use Google Earth as a complementary tool: we could obtain images of every building we want from every angle. With Google Earth, we have much greater access to aerial images of Venice. Not only do these images serve as image sets to generate dense point cloud models for calculating building heights, but we can also use the overlapping parts of different point cloud models to evaluate the models by cross-comparison.


In a nutshell, we apply a photogrammetric reconstruction tool to generate point cloud models. Photogrammetry is the art and science of extracting 3D information from photographs. For the YouTube-based method, we only target famous buildings that appear in different drone videos and select suitable images to form each building's image set. For the Google Earth method, we capture images of the whole of Venice manually, following a systematic photogrammetry trajectory.


== Point Cloud Generation ==
A sparse point cloud is generated after aligning the multi-angle photos, during which overlapping photo pairs are detected. The dense point cloud is then built from the camera positions estimated during tie-point-based matching and the depth map of each camera, and can reach a greater density than LiDAR point clouds.[1]
 
== Point Cloud Optimization ==
=== Outlier Removal ===
When collecting data from scanning devices, the resulting point cloud tends to contain noise and artifacts that one would like to remove. Apart from selecting points manually in Agisoft Metashape, we also use Open3D to deal with the noise and artifacts. Open3D offers two ways to detect and remove outliers: '''statistical outlier removal''' and '''radius outlier removal'''. Radius outlier removal removes points that have few neighbors in a given sphere around them, while statistical outlier removal removes points that are further away from their neighbors compared to the average for the point cloud. We set the number of neighbors to 50 and the radius to 0.01. Automatic outlier removal can remove around 75% of the noise, but the result is not as good as gradual manual selection. Therefore, automatic outlier removal serves as a preprocessing step: after it, we still need to de-noise the point cloud manually.
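The radius criterion can be sketched in a few lines of NumPy. In practice Open3D's built-in filters ('remove_radius_outlier' and 'remove_statistical_outlier') do this efficiently with a KD-tree; the brute-force helper below ('radius_outlier_mask') is a hypothetical illustration only:

```python
import numpy as np

def radius_outlier_mask(points, radius=0.01, min_neighbors=3):
    """Keep points that have at least `min_neighbors` other points within
    `radius` (brute-force O(n^2) pairwise distances, illustration only)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    neighbor_counts = (d <= radius).sum(axis=1) - 1   # exclude the point itself
    return neighbor_counts >= min_neighbors

# a tight cluster of four points plus one isolated noise point
pts = np.array([[0.0, 0.0, 0.0], [0.005, 0.0, 0.0], [0.0, 0.005, 0.0],
                [0.005, 0.005, 0.0], [1.0, 1.0, 1.0]])
mask = radius_outlier_mask(pts, radius=0.01, min_neighbors=2)
# the isolated point gets mask == False and would be dropped via pts[mask]
```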
=== Error Reduction ===
[[File:methodology.png|x195px|thumb|upright=1.5|right|Figure 3: Methodology]]
'''The first phase''' of error reduction is to remove points that are the consequence of poor camera geometry. Reconstruction uncertainty is a numerical representation of the uncertainty in the position of a tie point based on the geometric relationship of the cameras from which that point was projected or triangulated; it is the ratio between the largest and smallest semi-axis of the error ellipse created when triangulating 3D point coordinates between two images. Points constructed from image locations that are too close to each other have a low base-to-height ratio and high uncertainty. Removing these points does not affect the accuracy of optimization, but it reduces the noise in the point cloud and prevents points with large uncertainty in the z-axis from influencing other points with good geometry or being incorrectly removed in the reprojection error step. A reconstruction uncertainty of 10, which can be selected with the gradual selection filter, is roughly equivalent to a good base-to-height ratio of 1:2.3, whereas 15 is roughly equivalent to a marginally acceptable base-to-height ratio of 1:5.5. The reconstruction uncertainty selection procedure is repeated twice to reduce the reconstruction uncertainty toward 10 without deleting more than 50 percent of the tie points each time.[1]
'''The second phase''' of error reduction removes points based on projection accuracy. Projection accuracy is essentially a representation of the precision with which the tie point can be known, given the size of the key points that intersect to create it. The key point size is the standard deviation of the Gaussian blur at the scale at which the key point was found. The smaller the mean key point value, the smaller the standard deviation and the more precisely located the key point is in the image. The highest-accuracy points are assigned to level 1 and are weighted based on the relative size of the pixels. A tie point assigned to level 2 has twice as much projection inaccuracy as level 1. The projection accuracy selection procedure is repeated, without deleting more than 50% of the points each time, until level 2 is reached and few points are selected.[1]
'''The final phase''' of error reduction removes points based on reprojection error. This is a measure of the error between a 3D point's original location on the image and the location of the point when it is projected back to each image used to estimate its position. Error values are normalized based on key point size. A high reprojection error usually indicates poor localization accuracy of the corresponding point projections at the point-matching step. Reprojection error can be reduced by iteratively selecting and deleting points, then optimizing, until the unweighted RMS reprojection error is between 0.13 and 0.18, which means there is 95% confidence that the remaining points meet the estimated error.[1]
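The select-delete-optimize loop of this final phase can be illustrated with a small NumPy sketch. This is conceptual only: 'filter_rms' is a hypothetical helper, and a real Metashape workflow re-optimizes the cameras after each deletion rather than simply dropping values:

```python
import numpy as np

def filter_rms(errors, target_rms=0.18, drop_frac=0.10):
    """Iteratively drop the worst `drop_frac` of tie points until the
    unweighted RMS reprojection error falls below `target_rms`."""
    errors = np.sort(np.asarray(errors, dtype=float))
    while errors.size > 1 and np.sqrt(np.mean(errors ** 2)) > target_rms:
        errors = errors[: max(1, int(errors.size * (1 - drop_frac)))]
    return errors

# 90 well-matched tie points (0.1 px error) polluted by 10 bad ones (1.0 px)
errors = np.concatenate([np.full(90, 0.1), np.full(10, 1.0)])
filtered = filter_rms(errors)   # the 10 high-error points are removed
```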
== Plane Generation ==
[[File:ground.png|thumb|400px|upright=1.5|right|Figure 4: Plane Generation]]
To calculate the height of the buildings, we have to find the ground of the Venice PointCloud model, which can be retrieved with a tool provided by Open3D. Open3D supports segmentation of geometric primitives from point clouds using "Random Sample Consensus" (RANSAC), an iterative method to estimate the parameters of a mathematical model from a set of observed data containing outliers, when the outliers are to be accorded no influence on the values of the estimates. To find the ground plane with the largest support in the point cloud, we use segment_plane, specifying "distance_threshold". The support of a hypothesised plane is the set of all 3D points whose distance from that plane is at most some threshold (e.g. 10 cm). After each run of the RANSAC algorithm, the plane with the largest support is chosen. All the points supporting that plane can then be used to refine the plane equation by performing PCA or linear regression on the support set, which is more robust than just using the neighbouring points.
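With Open3D this is a one-liner, e.g. 'plane_model, inliers = pcd.segment_plane(distance_threshold=0.1, ransac_n=3, num_iterations=1000)'. The NumPy sketch below re-implements the idea to show what RANSAC does internally ('segment_plane_ransac' is a hypothetical helper, not the library function):

```python
import numpy as np

def segment_plane_ransac(points, distance_threshold=0.1, num_iterations=200, seed=0):
    """Fit a plane to 3 random points per iteration and keep the model with
    the largest support (points within `distance_threshold` of the plane)."""
    rng = np.random.default_rng(seed)
    best_model, best_inliers = None, np.array([], dtype=int)
    for _ in range(num_iterations):
        p0, p1, p2 = points[rng.choice(len(points), 3, replace=False)]
        n = np.cross(p1 - p0, p2 - p0)
        if np.linalg.norm(n) < 1e-12:        # degenerate (collinear) sample
            continue
        n = n / np.linalg.norm(n)
        d = -n @ p0                          # plane: n . x + d = 0
        inliers = np.flatnonzero(np.abs(points @ n + d) <= distance_threshold)
        if inliers.size > best_inliers.size:
            best_model, best_inliers = (n[0], n[1], n[2], d), inliers
    return best_model, best_inliers

# toy cloud: a 5x5 ground grid at z = 0 plus five "roof" points at z = 5
xs, ys = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 1, 5))
ground = np.column_stack([xs.ravel(), ys.ravel(), np.zeros(25)])
roofs = np.array([[0.2, 0.2, 5], [0.4, 0.8, 5], [0.6, 0.1, 5],
                  [0.8, 0.5, 5], [0.3, 0.6, 5]])
model, inliers = segment_plane_ransac(np.vstack([ground, roofs]))
# recovers the ground plane z = 0, supported by the 25 grid points
```

The inlier set returned here plays the role of the plane's "support" described above.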
== Plane Redressing ==


After selecting the most convenient plane and its supporting points, we define the plane equation in 3D space. We want to build a PointCloud model where the ''Z'' coordinate of a point equals the height of this point in the real world, so the next step is to redress the plane, mapping it onto the ''X-O-Y'' plane. For this calculation, we used the projection matrix mentioned below.


Here, the plane equation is ''ax + by + cz + d = 0''
The redressing was performed in two steps, first translation and then rotation.  


For the translation, the plane intersects the ''Z''-axis at (0, 0, -''d''/''c''). So the translation is [[File:C.png|200px|center|]] after which the plane passes through the origin with normal vector ''v'' = (''a'', ''b'', ''c'')<sup>T</sup>


For the rotation, the angle between ''v'' and ''k'' = (0, 0, 1)<sup>T</sup> is given by:


[[File:D.png|200px|center|]]




The axis of rotation has to be orthogonal to ''v'' and ''k'', so its versor is:


[[File:E.png|350px|center|]]


The rotation is represented by the matrix:
[[File:planeredress.png|400px|center]]


Please note that:
[[File:u1u2.png|400px|center|]]
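The two steps can be combined into a NumPy sketch ('redress_transform' is a hypothetical helper; it assumes ''c'' ≠ 0 and builds the rotation matrix with the Rodrigues formula, equivalent to the matrices above):

```python
import numpy as np

def redress_transform(a, b, c, d):
    """Translation + rotation that maps the plane ax + by + cz + d = 0
    onto the plane z = 0 (assumes c != 0)."""
    t = np.array([0.0, 0.0, d / c])     # move (0, 0, -d/c) to the origin
    v = np.array([a, b, c], dtype=float)
    v /= np.linalg.norm(v)              # unit normal of the plane
    k = np.array([0.0, 0.0, 1.0])
    cos_t = v @ k                       # cos(theta) between v and k
    axis = np.cross(v, k)
    sin_t = np.linalg.norm(axis)
    if sin_t < 1e-12:                   # plane is already horizontal
        return t, np.eye(3)
    u = axis / sin_t                    # unit rotation axis (the versor)
    U = np.array([[0, -u[2], u[1]],
                  [u[2], 0, -u[0]],
                  [-u[1], u[0], 0]])    # cross-product matrix of u
    R = cos_t * np.eye(3) + sin_t * U + (1 - cos_t) * np.outer(u, u)
    return t, R

# redress the tilted plane x + z = 0: a point on it ends up with z = 0
t, R = redress_transform(1.0, 0.0, 1.0, 0.0)
p = R @ (np.array([1.0, 0.0, -1.0]) + t)
```

Applying 'points = (R @ (points + t).T).T' to the whole cloud then makes the z-coordinate of every point its height above the ground plane.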


== Height Detection ==
=== Point-to-Plane Based Height Calculation and City Elevation Model Construction ===


Once the plane equation is obtained in the previous step, we can start calculating the height of the buildings. In the first step, we calculate the relative height of the buildings based on the formula for the point-to-plane distance. The coordinate information of each point can be accessed through the 'points' attribute of the 'PointCloud' object with the help of Open3D. Since we already have the plane equation, using this formula we can obtain the distance between each point and the plane and save the distances in an (n, 1) array.
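A vectorized version of this step might look as follows (the plane coefficients and points are illustrative; in the pipeline the array would come from 'np.asarray(pcd.points)'):

```python
import numpy as np

# hypothetical plane coefficients (as returned by segment_plane) and toy points
a, b, c, d = 0.0, 0.0, 1.0, -2.0                  # the plane z = 2
points = np.array([[0.0, 0.0, 2.0],
                   [1.0, 1.0, 5.0],
                   [3.0, 2.0, 10.0]])

# point-to-plane distance |ax + by + cz + d| / sqrt(a^2 + b^2 + c^2), per point
distances = np.abs(points @ np.array([a, b, c]) + d) / np.linalg.norm([a, b, c])
# distances.reshape(-1, 1) gives the (n, 1) array described above
```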


To visualise the height of the buildings, the first method is to change the colors of the points in the PointCloud model. Each point in the 'PointCloud' has a 'colors' attribute. Calling this attribute returns an (n, 3) array with values in the range [0, 1], containing the RGB color information of the points. Therefore, by normalising the values of the distances array to the range [0, 1] and expanding the array to three dimensions by replication, we turn the (n, 1) 'height array' into an (n, 3) 'colors array'.


In essence, we transform the height information of each point into its color information, expressing height by color: higher points have lighter colors and vice versa. This model provides an intuitive sense of the building heights, which is more qualitative than quantitative.
[[File:venice_blacknwhite.png|thumb|1200px|upright=1.5|center|Figure 5: City Elevation Model]]
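The normalise-and-replicate step described above can be sketched as follows (the three distance values are illustrative):

```python
import numpy as np

# (n, 1) point-to-plane distances from the previous step (illustrative values)
distances = np.array([[0.0], [4.9], [98.6]])

# normalise to [0, 1] and replicate to (n, 3): height becomes a gray level,
# so higher points get lighter colors
normalized = (distances - distances.min()) / (distances.max() - distances.min())
colors = np.repeat(normalized, 3, axis=1)

# with Open3D the array is written back via
# pcd.colors = o3d.utility.Vector3dVector(colors)
```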


=== Coordinate Based Height Calculation and City Elevation Map Construction ===


To show the building heights more quantitatively, we decided to construct a City Elevation Map, which can be achieved by plotting the height information of the points on a 2D map. However, there is a prerequisite: we need the actual height of every point. After implementing "Plane Redressing", we can assume that the z-coordinate approximates the height. But before using it directly, we still need to scale the model so that the heights of buildings in the model match their actual heights. For instance, if we know the actual height of a reference point in the real world and can also obtain the height of this reference point in the PointCloud model, we can derive the scale factor of the PointCloud model. For every other building in the model, the actual height then equals the height in the PointCloud model multiplied by the scale factor.


To put this into practice, we need to set a reference point in the model, and "Campanile di San Marco" is the best choice: at 98.6 metres, it is the tallest building in Venice. In this step, the built-in function "Scale" from Open3D is adopted, and we finally obtain a 1:1 Venice PointCloud model where the z-coordinate of a point equals the actual height of this point in the real world.
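The scaling step reduces to one division (the model z-coordinates below are hypothetical values, not our measurements):

```python
# the real 'Campanile di San Marco' is 98.6 m tall; its z-coordinate in the
# unscaled model (12.4 here) is a hypothetical illustrative value
REFERENCE_HEIGHT_M = 98.6
reference_z_in_model = 12.4

scale_factor = REFERENCE_HEIGHT_M / reference_z_in_model

# any other building's actual height = height in the model * scale factor
actual_height = 5.54 * scale_factor      # 5.54 is a hypothetical model height

# with Open3D the whole cloud is rescaled in place, e.g.
# pcd.scale(scale_factor, center=(0, 0, 0))
```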


{|class="wikitable" style="text-align:right; font-family:Arial, Helvetica, sans-serif !important;;"
|-
|[[File:Height_tab20c.png|280px|center|thumb|upright=1.5|Figure 6: Venice]]
|[[File:Youtube1_height_map.png|300px|center|thumb|upright=1.5|Figure 7: San Marco]]
|[[File:Youtube2_height_map.png|300px|center|thumb|upright=1.5|Figure 8: San Giorgio Maggiore]]
|[[File:Youtube3_height_map.png|240px|center|thumb|upright=1.5|Figure 9: Santa Maria della Salute]]
|}


== Cadaster Alignment ==
[[File:alignment.png|400px|right|thumb|upright=1.5|Figure 10: Alignment]]
In order to map our City Elevation Map onto the cadaster map, we use QGIS for georeferencing[https://www.sciencedirect.com/topics/social-sciences/georeferencing], the process of assigning locations to geographical objects within a geographic frame of reference. QGIS enables us to manually mark ground control points in both maps: first, we select a representative location in the City Elevation Map, then we select the same location in the reference cadaster map. We set around ten pairs of ground control points to align the two maps, and the Venice zone EPSG 3004 is chosen as the coordinate reference system in the transformation settings. The blue layer is the cadaster map, while the red layer is our City Elevation Map. The georeferencing result is acceptable: our height map matches the cadaster accurately, without noticeable distortion.


= Quality Assessment =


For each building, the average building height was obtained using the interactive point selection method of Open3D. To evaluate the precision of our model, we compared the average measured height with the actual building height.
[[File:Error_rate.PNG|550px|center]]


Then we take the absolute value of each building's error rate to compute the mean error.


For each model,


[[File:Mean.PNG|180px|center]] where n is the total number of buildings in the model
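Putting the two formulas together (the measured and reference heights below are illustrative, not our full evaluation table):

```python
# mean absolute error rate over n buildings, following the formulas above
measured = [44.0, 99.0, 42.0]     # average measured heights (m), illustrative
reference = [43.0, 98.6, 42.0]    # reference heights (m)

error_rates = [abs(m - r) / r for m, r in zip(measured, reference)]
mean_error = sum(error_rates) / len(error_rates)
print(f"mean error = {mean_error:.1%}")
```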
== Objective Assessment of Google-Earth-Based Venice Model ==
The mean error of the building height measurement in the Google Earth based Venice Model is '''3.3%''', based on the landmarks and other buildings below.


[[File:venice_whole.png|1200px|center|thumb|upright=1.5|Figure 11: The Whole Venice]]
=== Assessment of Landmarks ===
In order to obtain the height information of below specific buildings in the model, we use function 'pick_points(pcd)' in Open3D to selct the vertex of building. Then we can get coordinate of the vertex and the z-coordinate represents the corresponding building's height.
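The point-to-plane step can be sketched as follows. This is a minimal illustration: the plane coefficients and the point array are stand-in values, while in the project the points come from `np.asarray(pcd.points)` and the plane from the plane-generation step.

```python
import numpy as np

# Hypothetical ground-plane coefficients for ax + by + cz + d = 0
# (in the project these come from the plane-generation step).
a, b, c, d = 0.01, -0.02, 1.0, 0.05
normal = np.array([a, b, c])

# Stand-in for np.asarray(pcd.points): one point per row, columns x, y, z.
points = np.array([[0.0, 0.0, 0.44],
                   [1.0, 2.0, 0.99]])

# Point-to-plane distance: |ax + by + cz + d| / sqrt(a^2 + b^2 + c^2)
relative_heights = np.abs(points @ normal + d) / np.linalg.norm(normal)
```

The resulting array holds one relative height per point; the highest value inside a building footprint is taken as that building's relative height.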
{| class="wikitable" style="text-align:center;"
|- style="font-weight:bold; vertical-align:middle; background-color:#EAECF0;"
! colspan="5" |
|- style="font-weight:bold; vertical-align:middle; background-color:#EAECF0;"
| style="font-family:Arial, Helvetica, sans-serif !important;;" | Name of Buildings
| style="font-family:Arial, Helvetica, sans-serif !important;;" | Average Measured Height
| style="font-family:Arial, Helvetica, sans-serif !important;;" | Reference Height
| Error
| Pictures
|- style="background-color:#F8F9FA;"
| style="vertical-align:middle; font-family:Arial, Helvetica, sans-serif !important;;" | Basilica di San Marco
| style="vertical-align:middle; font-family:Arial, Helvetica, sans-serif !important;;" | 44m
| style="font-family:Arial, Helvetica, sans-serif !important;; color:#36B;" | 43m
| style="vertical-align:middle;" | 2.32%
| style="vertical-align:middle;" | [[File:BasilicadiSanMarco.png|200px|center]]
|- style="background-color:#F8F9FA;"
| style="vertical-align:middle; font-family:Arial, Helvetica, sans-serif !important;;" | Campanile di San Marco
| style="vertical-align:middle; font-family:Arial, Helvetica, sans-serif !important;;" | 99m
| style="font-family:Arial, Helvetica, sans-serif !important;; color:#36B;" | 98.6m
| style="vertical-align:middle;" | 0.40%
| style="vertical-align:middle;" | [[File:CampanilediSanMarco.png|200px|center]]
|- style="vertical-align:middle; background-color:#F8F9FA;"
| style="font-family:Arial, Helvetica, sans-serif !important;;" | Chiesa di San Giorgio Maggiore
| style="font-family:Arial, Helvetica, sans-serif !important;;" | 42m
| style="font-family:Arial, Helvetica, sans-serif !important;;" | NA
| NA
| rowspan="2" | [[File:ChiesadiSanGiorgioMaggiore.png|200px|center]]
|- style="background-color:#F8F9FA;"
| style="vertical-align:middle; font-family:Arial, Helvetica, sans-serif !important;;" | Campanile di San Giorgio Maggiore
| style="vertical-align:middle; font-family:Arial, Helvetica, sans-serif !important;;" | 72m
| style="font-family:Arial, Helvetica, sans-serif !important;; color:#36B;" | 63m
| style="vertical-align:middle;" | 14.29%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| style="font-family:Arial, Helvetica, sans-serif !important;;" | Basilica di Santa Maria della Salute
| style="font-family:Arial, Helvetica, sans-serif !important;;" | 64m
| style="font-family:Arial, Helvetica, sans-serif !important;;" | NA
| NA
| rowspan="2" | [[File:BasilicadiSantaMariadellaSalute.png|200px|center]]
|- style="vertical-align:middle; background-color:#F8F9FA;"
| style="font-family:Arial, Helvetica, sans-serif !important;;" | Campanile di Santa Maria della Salute
| style="font-family:Arial, Helvetica, sans-serif !important;;" | 49m
| style="font-family:Arial, Helvetica, sans-serif !important;;" | NA
| NA
|- style="background-color:#F8F9FA;"
| style="vertical-align:middle; font-family:Arial, Helvetica, sans-serif !important;;" | Basilica dei Santi Giovanni e Paolo
| style="vertical-align:middle; font-family:Arial, Helvetica, sans-serif !important;;" | 54m
| style="font-family:Arial, Helvetica, sans-serif !important;; color:#36B;" | 55.4m
| style="vertical-align:middle;" | 2.52%
| style="vertical-align:middle;" | [[File:BasilicadeiSantiGiovanniePaolo.png|200px|center]]
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | Campanile di Chiesa di San Geremia
| style="vertical-align:middle; background-color:#F8F9FA;" | 50m
| style="background-color:#F8F9FA; color:#36B;" | 50m
| style="vertical-align:middle; background-color:#F8F9FA;" | 0.00%
| [[File:CampanilediChiesadiSanGeremia.png|200px|center]]
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | Campanile dei Gesuiti
| style="vertical-align:middle; background-color:#F8F9FA;" | 43m
| style="background-color:#F8F9FA; color:#36B;" | 40m
| style="vertical-align:middle; background-color:#F8F9FA;" | 7.50%
| [[File:CampaniledeiGesuiti.png|200px|center]]
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | Campanile San Francesco della Vigna
| style="vertical-align:middle; background-color:#F8F9FA;" | 76m
| style="background-color:#F8F9FA; color:#36B;" | 69m
| style="vertical-align:middle; background-color:#F8F9FA;" | 10.14%
| [[File:CampanileSanFrancescodellaVigna.png|200px|center]]
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | Chiesa del Santissimo Redentore
| style="vertical-align:middle; background-color:#F8F9FA;" | 47m
| style="background-color:#F8F9FA; color:#36B;" | 45m
| style="vertical-align:middle; background-color:#F8F9FA;" | 4.44%
| [[File:ChiesadelSantissimoRedentore.png|200px|center]]
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | Hilton Molino Stucky Venice
| style="vertical-align:middle; background-color:#F8F9FA;" | 45m
| style="vertical-align:middle; background-color:#F8F9FA;" | NA
| style="vertical-align:middle; background-color:#F8F9FA;" | NA
| [[File:HiltonMolinoStuckyVenice.png|200px|center]]
|}
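The error column and the mean error follow directly from the error-rate formula above; a minimal sketch, with the height values taken from the table rows that have known references:

```python
# Error rate of one building: |measured - reference| / reference, in percent.
def error_rate(measured, reference):
    return abs(measured - reference) / reference * 100.0

# Mean error of a model: average of the absolute error rates over n buildings.
def mean_error(pairs):
    return sum(error_rate(m, r) for m, r in pairs) / len(pairs)

# Example with two landmarks from the table above.
basilica = error_rate(44, 43)      # ~2.33%
campanile = error_rate(99, 98.6)   # ~0.41%
```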


In the second step, we use this mapping to calculate the real heights of the buildings. For instance, we know the real height of St Mark's Basilica, and we have already calculated its relative height, so we can obtain the scale factor of the point cloud model. Then, for every other building in the model: real height of the building = relative height of the building × scale factor.
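A sketch of this two-step mapping, assuming the Campanile di San Marco (98.6 m) is the reference building; the relative heights below are illustrative model units, not values measured from our point cloud:

```python
# Relative heights measured in model units (illustrative values).
relative = {"Campanile di San Marco": 0.99, "Unknown building": 0.20}

# Derive the scale factor from the reference building ...
REFERENCE, REAL_HEIGHT_M = "Campanile di San Marco", 98.6
scale = REAL_HEIGHT_M / relative[REFERENCE]

# ... and map every other building to its real height.
real = {name: h * scale for name, h in relative.items()}
```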
=== Assessment of other Buildings ===


{| class="wikitable" style="text-align:center;"
|- style="text-align:left;"
! colspan="4" | [[File:Assess2.jpg|550px|center|Figure 12: Buildings near Basilica dei Santi Giovanni e Paolo]]
! colspan="4" | [[File:Assess3.jpg|600px|center|Figure 13: Buildings near Basilica dei Frari]]
|- style="font-weight:bold; vertical-align:middle; background-color:#EAECF0;"
| colspan="4" | Buildings near Basilica dei Santi Giovanni e Paolo
| colspan="4" | Buildings near Basilica dei Frari
|- style="font-weight:bold; vertical-align:middle; background-color:#EAECF0;"
| Index
| Average Measured Height
| Reference Height
| Error
| Index
| Average Measured Height
| Reference Height
| Error
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 1
| 28.25m
| 28.1m
| 0.53%
| 11
| 63m
| 63.07m
| 0.11%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 2
| 19.5m
| 19.44m
| 0.30%
| 12
| 31m
| 30.94m
| 0.19%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 3
| 18.5m
| 18.77m
| 1.43%
| 13
| 20m
| 20.8m
| 3.84%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 4
| 23m
| 22.59m
| 1.81%
| 14
| 20m
| 21.28m
| 6.01%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 5
| 17m
| 16.95m
| 0.29%
| 15
| 26m
| 28.22m
| 7.86%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 6
| 31m
| 30.83m
| 0.55%
| 16
| 24m
| 24.92m
| 3.69%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 7
| 20.5m
| 21.02m
| 2.47%
| 17
| 26m
| 26.4m
| 1.51%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 8
| 17m
| 17.24m
| 1.39%
| 18
| 46m
| 46.58m
| 1.24%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 9
| 27.5m
| 26.31m
| 4.52%
| 19
| 15m
| 15.87m
| 5.48%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 10
| 19.5m
| 19.41m
| 0.46%
| 20
| 10m
| 10.56m
| 5.30%
|}


== City Elevation Model Construction ==
To visualize the heights of the buildings, we build a height map. However, we do not have 2D coordinates for the points: the point cloud is a three-dimensional object, and the ground plane does not coincide with a coordinate plane of the model, so we cannot simply plot the height information on a 2D map. An alternative visualization method is to express the height of each point through the 'colors' attribute of the 'PointCloud' object.

For each point in the 'PointCloud', the 'colors' attribute holds an array of shape (n, 3), with values in the range [0, 1], containing the RGB color of the point. Therefore, by normalising the values of the distance array to the range [0, 1], and expanding the array to three dimensions by replication, we obtain a 'height array' of shape (n, 3) in the form of a 'colors' array.

== Objective Assessment of Youtube-Video-Based Venice Model ==
{| class="wikitable" style="text-align:center;"
|-
! colspan="4" | [[File:Assess4.png|550px|center|Figure 14: Buildings near Basilica Cattedrale Patriarcale di San Marco]]
! colspan="4" | [[File:Assess5.png|600px|center|Figure 15: Buildings near Basilica di Santa Maria della Salute]]
|- style="font-weight:bold; vertical-align:middle; font-size:16px; background-color:#EAECF0;"
| colspan="4" | Buildings near Basilica Cattedrale Patriarcale di San Marco
| colspan="4" | Buildings near Basilica di Santa Maria della Salute
|- style="font-weight:bold; vertical-align:middle; font-size:16px; background-color:#F8F9FA;"
| Index
| Average Measured Height
| Reference Height
| Error
| Index
| Average Measured Height
| Reference Height
| Error
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 1
| 99m
| 98.6m
| 0.40%
| 8
| 21m
| 21.76m
| 3.49%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 2
| 44m
| 43m
| 2.32%
| 9
| 19m
| 20.12m
| 5.56%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 3
| 30m
| 30.59m
| 1.92%
| 10
| 11m
| 11.15m
| 1.34%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 4
| 24.33m
| 23.21m
| 4.82%
| 11
| 22m
| 23.21m
| 5.21%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 5
| 26m
| 26.35m
| 1.32%
| 12
| 11m
| 12.07m
| 8.86%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 6
| 21m
| 22.21m
| 5.44%
| 13
| 14m
| 14.18m
| 1.26%
|- style="vertical-align:middle; background-color:#F8F9FA;"
| 7
| 21m
| 21.53m
| 2.46%
| 14
| 16m
| 16.98m
| 5.77%
|}


== Subjective Assessment of Google-Earth-Based Venice Model ==
For the subjective assessment, we gathered ten volunteers to give subjective scores on part of our City Elevation Map in order to evaluate its visual quality. We use the Double Stimulus Impairment Scale (DSIS), a double-stimulus subjective experiment in which pairs of images are shown side by side to a group of people for rating. One stimulus is always the reference image, while the other is the image to be evaluated. The participants rate the images on a five-level grading scale according to the impairment they perceive between the two images (5 - Imperceptible, 4 - Perceptible but not annoying, 3 - Slightly annoying, 2 - Annoying, 1 - Very annoying). We compute the Differential Mean Opinion Score (DMOS) for the two maps below. The DMOS score is computed as:


[[File:DMOS.png|150px|center|]]
[[File:DV.png|150px|center|]]
where ''N'' is the number of participants and DV<sub>ij</sub> is the differential score given by participant ''i'' for stimulus ''j''. The score of the reference maps is always set to 5. The DMOS scores of the images are:
{| class="wikitable" style="text-align:center;"
|- style="font-weight:bold;"
! colspan="2" style="vertical-align:middle; background-color:#EAECF0;" | DMOS
! rowspan="13" style="vertical-align:middle; background-color:#EAECF0;" | [[File:Red_1.png|500px|center|thumb|upright=1.5|Figure 16: Details of the City Elevation Map (First Image)]]
! rowspan="13" style="font-weight:normal; text-align:left;" | [[File:Red_2.png|500px|center|thumb|upright=1.5|Figure 17: Details of the Given Cadaster (First Reference)]]
|- style="font-weight:bold;"
| style="vertical-align:middle; background-color:#F8F9FA;" | Participant
| style="vertical-align:middle; background-color:#F8F9FA;" | First Image
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 1
| style="vertical-align:middle; background-color:#F8F9FA;" | 5
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 2
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 3
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 5
| style="vertical-align:middle; background-color:#F8F9FA;" | 5
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 6
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 7
| style="vertical-align:middle; background-color:#F8F9FA;" | 5
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 8
| style="vertical-align:middle; background-color:#F8F9FA;" | 5
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 9
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 10
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | Average
| style="vertical-align:middle; background-color:#F8F9FA;" | 4.4
|}
{| class="wikitable" style="text-align:center;"
|- style="font-weight:bold; vertical-align:middle; background-color:#EAECF0;"
! colspan="2" | DMOS
! rowspan="13" | [[File:R3.png|500px|center|thumb|upright=1.5|Figure 18: Details of the City Elevation Map (Second Image)]]
! rowspan="13" | [[File:R4.png|500px|center|thumb|upright=1.5|Figure 19: Details of the Given Cadaster (Second Reference)]]
|- style="font-weight:bold;"
| style="vertical-align:middle; background-color:#F8F9FA;" | Participant
| style="vertical-align:middle; background-color:#F8F9FA;" | Second Image
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 1
| style="vertical-align:middle; background-color:#F8F9FA;" | 5
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 2
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 3
| style="vertical-align:middle; background-color:#F8F9FA;" | 3
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 5
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 6
| style="vertical-align:middle; background-color:#F8F9FA;" | 3
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 7
| style="vertical-align:middle; background-color:#F8F9FA;" | 5
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 8
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 9
| style="vertical-align:middle; background-color:#F8F9FA;" | 4
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | 10
| style="vertical-align:middle; background-color:#F8F9FA;" | 5
|-
| style="vertical-align:middle; background-color:#F8F9FA;" | Average
| style="vertical-align:middle; background-color:#F8F9FA;" | 4.1
|}
According to the DMOS results, both images received high scores, which indicates acceptable fidelity in terms of subjective quality assessment.
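With the reference always scored 5, the averages in the tables above reduce to a mean over participants; the scores below are copied from the two tables:

```python
def dmos(scores):
    # DMOS = (1 / N) * sum over the N participants of the scores DV_ij.
    return sum(scores) / len(scores)

first_image = [5, 4, 4, 4, 5, 4, 5, 5, 4, 4]
second_image = [5, 4, 3, 4, 4, 3, 5, 4, 4, 5]
```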


= Display of YouTube-Video-Based Venice Models =
== Basilica Cattedrale Patriarcale di San Marco ==
{|class="wikitable" style="text-align:right; font-family:Arial, Helvetica, sans-serif !important;;"
|-
|[[File:tiedpoint2.png|280px|center|thumb|upright=1.5|Figure 20: Tie Point Model]]
|[[File:densecloud2.png|280px|center|thumb|upright=1.5|Figure 21: Dense Cloud Model]]
|[[File:tiled2.png|280px|center|thumb|upright=1.5|Figure 22: Tiled Model]]
|[[File:mesh2.png|280px|center|thumb|upright=1.5|Figure 23: Mesh Model]]
|}


== Basilica di San Giorgio Maggiore ==
{|class="wikitable" style="text-align:right; font-family:Arial, Helvetica, sans-serif !important;;"
|-
|[[File:tiedpoint.png|280px|center|thumb|upright=1.5|Figure 24: Tie Point Model]]
|[[File:Densecloud.png|280px|center|thumb|upright=1.5|Figure 25: Dense Cloud Model]]
|[[File:tiled.png|280px|center|thumb|upright=1.5|Figure 26: Tiled Model]]
|[[File:mesh.png|280px|center|thumb|upright=1.5|Figure 27: Mesh Model]]
|}


== Basilica di Santa Maria della Salute ==
{|class="wikitable" style="text-align:right; font-family:Arial, Helvetica, sans-serif !important;;"
|-
|[[File:tied_2.png|280px|center|thumb|upright=1.5|Figure 28: Tie Point Model]]
|[[File:dense_2.png|280px|center|thumb|upright=1.5|Figure 29: Dense Cloud Model]]
|[[File:tiled_2.png|280px|center|thumb|upright=1.5|Figure 30: Tiled Model]]
|[[File:mesh_2.png|280px|center|thumb|upright=1.5|Figure 31: Mesh Model]]
|}


= Limitations and Further Work =


== Limitations ==


* '''Challenges in model implementation from Google Earth Image'''
Using Google Earth images solves the problems mentioned above, but it is not a perfect method either. We faced a dilemma when collecting images from Google Earth: when we took screenshots from far above the ground, the resolution of the images was very low and the details of the buildings were not well rendered, resulting in a low-quality point cloud model. When we took screenshots from closer to the ground, we obtained much better results with much clearer details; however, the number of images needed to build the point cloud model then increased exponentially, and the computation became almost impossible for a personal computer to handle.


One possible solution is to divide Venice into smaller parts, build a point cloud for every part, and merge the point cloud models at the end. However, this solution requires registration, which is both rather imprecise and hard to implement.


* '''Inaccurate Plane Generation and Redressing'''


One of the inherent defects of the methodology is that 'Plane Generation' and 'Plane Redressing' are imperfect. When we implement 'Plane Redressing', the ideal outcome is that, after a single transformation, the plane equation becomes ''0x + 0y + z = 0''. In practice this is not the case: the coefficients of x and y are not exactly 0. The error caused by this defect can be reduced by applying the transformation 1000 times: as the number of iterations increases, the coefficients of x and y approach zero.
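Conceptually, redressing rotates the cloud so that the fitted plane normal aligns with the z-axis. A minimal sketch using Rodrigues' rotation formula; the starting normal is an illustrative value, while in the project it comes from the generated plane equation:

```python
import numpy as np

def redress(points, normal, iterations=1000):
    """Rotate `points` so the ground-plane normal `normal` aligns with +z."""
    n = np.array(normal, dtype=float)
    n /= np.linalg.norm(n)
    z = np.array([0.0, 0.0, 1.0])
    for _ in range(iterations):
        axis = np.cross(n, z)        # rotation axis, |axis| = sin(theta)
        s = np.linalg.norm(axis)
        if s < 1e-12:                # normal already aligned with z
            break
        c = float(np.dot(n, z))      # cos(theta)
        K = np.array([[0.0, -axis[2], axis[1]],
                      [axis[2], 0.0, -axis[0]],
                      [-axis[1], axis[0], 0.0]])
        R = np.eye(3) + K + K @ K * ((1.0 - c) / s**2)  # Rodrigues' formula
        points = points @ R.T
        n = R @ n
    return points, n
```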


== Further Work ==
* '''More Precise Georeferencing and Pixel Matching'''


We chose only ten ground control points for QGIS to georeference the raster height map. Generally, the more points selected, the more accurately the image is registered to the target coordinates of the cadaster. In the future, after choosing enough ground control points, we will compare our Venice elevation map with the cadaster pixel by pixel at the same resolution, with both maps using the same color and grey scales.
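The planned pixel-by-pixel comparison could look like the sketch below; the two rasters are hypothetical 2-D height arrays already georeferenced to the same grid and resolution:

```python
import numpy as np

def pixel_mae(height_map, cadaster):
    # Mean absolute per-pixel difference between two aligned rasters
    # of identical resolution (hypothetical 2-D arrays).
    assert height_map.shape == cadaster.shape
    return float(np.abs(height_map - cadaster).mean())
```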


* '''Building Contour Sharpening'''


In our Venice elevation map, the building footprints are blurred and not well detected in some areas. In the future, we will use Shapely in Python to extract the building polygons and OpenCV to implement contour approximation.
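Pending that Shapely/OpenCV pipeline, the core of contour approximation (the algorithm behind OpenCV's `cv2.approxPolyDP`) is Ramer-Douglas-Peucker simplification; a self-contained sketch on a toy footprint:

```python
import numpy as np

def approx_contour(points, eps):
    # Ramer-Douglas-Peucker: drop vertices closer than `eps` to the
    # chord between the first and last point, recursing on the rest.
    points = np.asarray(points, dtype=float)
    if len(points) < 3:
        return points
    start, end = points[0], points[-1]
    line = end - start
    norm = np.linalg.norm(line)
    if norm == 0:                    # closed contour: fall back to radial distance
        d = np.linalg.norm(points - start, axis=1)
    else:                            # perpendicular distance to the chord
        d = np.abs(line[0] * (points[:, 1] - start[1])
                   - line[1] * (points[:, 0] - start[0])) / norm
    i = int(np.argmax(d))
    if d[i] > eps:
        left = approx_contour(points[: i + 1], eps)
        right = approx_contour(points[i:], eps)
        return np.vstack([left[:-1], right])
    return np.vstack([start, end])
```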


= Planning =
{| class="wikitable"
|-
|
|
* Configure the environment and master the related software and libraries
* Find suitable drone videos on YouTube and extract photos for further photogrammetry work
| align="center" | ✓
|-
| week 8-9
|
* Align photos, generate and denoise dense point clouds, build mesh and tiled models of Venice
* Collect screenshots of the whole Venice on Google Earth
| align="center" | ✓
|-
| week 10
|
* Generate the plane of point cloud models
* Prepare for the midterm presentation
| align="center" | ✓
|-
| week 11
|
* Construct City Elevation Model
* Redress the plane and generate City Elevation Map
| align="center" | ✓
|-
| week 12
|
* Align our City Elevation Map to the given Venice cadaster
* Assess the visual quality and buildings' height accuracy of the Google-Earth-based Venice model, and the buildings' height accuracy of the Youtube-based Venice model
| align="center" | ✓
|-
| week 13
|
* Write final report, refine the map layers with building height information and the final models
| align="center" | ✓
|-
| week 14
|
* Refine the final report, code and models
* Prepare for the final presentation
| align="center" | ✓
|}
= References =
[1] Over, J.R., Ritchie, A.C., Kranenburg, C.J., et al. (2021). ''Processing Coastal Imagery with Agisoft Metashape Professional Edition, Version 1.6 — Structure from Motion Workflow Documentation''. USGS Open-File Report 2021-1039. https://pubs.er.usgs.gov/publication/ofr20211039
= Deliverables =
* Source Codes:
[https://github.com/yetiiil/venice_building_height https://github.com/yetiiil/venice_building_height]
* Venice Point Cloud Models:
[https://filesender.switch.ch/filesender2/?s=download&token=0b7afa31-f914-43e5-899d-ba29402f3754 Google Earth Model]
[https://filesender.switch.ch/filesender2/?s=download&token=12fdf964-9d38-42f7-9270-81bbdb0711d7 Basilica di Santa Maria della Salute Model]
[https://filesender.switch.ch/filesender2/?s=download&token=39e18118-9146-412e-881c-6e23409bef9f Basilica di San Giorgio Maggiore Model]
[https://filesender.switch.ch/filesender2/?s=download&token=96271d6e-90c2-4d2c-a2b0-351df08c6945 Basilica Cattedrale Patriarcale di San Marco Model]
In case the files expire, please contact yuxiao.li@epfl.ch

Latest revision as of 11:58, 14 January 2022


= Milestones =

=== Milestone 1 ===
* Collect high resolution Venice drone videos on YouTube and Google Earth images as supplementary materials

=== Milestone 2 ===
* Align photos of Venice, generate sparse point clouds made up of only high-quality tie points, and repeatedly optimize the point cloud models by reconstruction uncertainty filtering, projection accuracy filtering and reprojection error filtering
* Remove outliers automatically and manually, and build dense point clouds based on the sparse clouds
* Build the Venice 3D model (mesh) and tiled model according to the dense point cloud data

=== Milestone 3 ===
* Generate the ground plane with the largest support in the point cloud and redress the plane by translation and rotation
* Construct the City Elevation Model by a point-to-plane method and generate it based on the z-coordinate after redressing

=== Milestone 4 ===
* Align the City Elevation Map to the reference cadaster using QGIS's georeferencing and evaluate the subjective visual quality of the map
* Evaluate the accuracy of the height calculation of both the Google-Earth-based and the Youtube-based Venice models

= Methodology =
== Images Acquisition ==

Figure 1: YouTube Video

A point cloud is a 3D content representation commonly preferred for its high efficiency and relatively low complexity in the acquisition, storage, and rendering of 3D models. In order to generate the Venice dense point cloud models, sequential images from different angles at different locations first need to be collected from the following two sources.

Figure 2: Google Earth

=== Youtube Video Method ===

Our initial plan was to download drone videos of Venice from the Youtube platform, use FFmpeg to extract one frame per second from each video, and generate dense point cloud models based on the images acquired. However, the suitable drone videos were very limited: only a few are of high quality. Among those qualified videos, most focus on monumental architecture such as St Mark's Square, St Mark's Basilica, and Doge's Palace, while other buildings show varying degrees of deficiency from various angles in the point clouds. Therefore, a large proportion of Venice could not be 3D-reconstructed if we adopted the Youtube video method alone.
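The frame-extraction step can be sketched as a small helper that assembles the FFmpeg command; the video filename and output pattern are hypothetical:

```python
import subprocess

def extraction_cmd(video, out_pattern="frames/%05d.jpg", fps=1):
    # One frame per second via FFmpeg's `fps` video filter.
    return ["ffmpeg", "-i", video, "-vf", f"fps={fps}", out_pattern]

# Running it requires FFmpeg installed and a real video file:
# subprocess.run(extraction_cmd("venice_drone.mp4"), check=True)
```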

=== Google Earth Method ===

In order to cover the whole area of Venice, we decided to use Google Earth as a complementary tool: we can obtain images of every building we want, at every angle. With Google Earth, we have much greater access to aerial images of Venice; not only do these images serve as the image sets for generating dense point cloud models to calculate building heights, but the overlapping parts of different point cloud models can also be used to evaluate the models by cross-comparison.

In a nutshell, we apply a photogrammetric reconstruction tool to generate point cloud models. Photogrammetry is the art and science of extracting 3D information from photographs. For the Youtube-based method, we only target famous buildings that appear in several drone videos and select suitable images to form each building's image set. For the Google Earth method, we capture images of the whole of Venice manually, following a systematic photogrammetric trajectory.

Point Cloud Generation

A sparse point cloud is generated by aligning the multi-angle photos, a step in which overlapping photo pairs are detected. The dense point cloud is then computed from the camera positions estimated during tie-point matching and from the depth map of each camera, and it can reach a greater density than LiDAR point clouds.[1]

Point Cloud Optimization

Outlier Removal

When collecting data from scanning devices, the resulting point cloud tends to contain noise and artifacts that one would like to remove. Apart from selecting points manually in Agisoft Metashape, we also use Open3D to deal with the noise and artifacts. Open3D offers two ways to detect and remove outliers: statistical outlier removal and radius outlier removal. Radius outlier removal removes points that have few neighbors within a given sphere around them, while statistical outlier removal removes points that are farther away from their neighbors than the average for the point cloud. We set the number of neighbors to 50 and the radius to 0.01. Automatic outlier removal can eliminate around 75% of the noise, but the result is not as good as manual gradual selection. It therefore serves as a preprocessing step: after automatic outlier removal, we still denoise the point cloud manually.
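The radius criterion can be sketched in plain NumPy. In the project we call Open3D's `remove_radius_outlier(nb_points, radius)`; the helper below and its brute-force neighbour search are illustrative only (a KD-tree would be used on a real cloud):

```python
import numpy as np

def radius_outlier_mask(points, nb_points=50, radius=0.01):
    """Keep points that have at least `nb_points` neighbours within `radius`,
    the same rule Open3D's remove_radius_outlier applies."""
    # Brute-force pairwise distances; fine for a sketch, too slow for millions of points.
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    neighbour_counts = (d <= radius).sum(axis=1) - 1  # exclude the point itself
    return neighbour_counts >= nb_points
```

Applying the mask (`points[mask]`) keeps the dense structures and drops isolated noise points.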

Error Reduction

Figure 3: YouTube Video

The first phase of error reduction removes points that are the consequence of poor camera geometry. Reconstruction uncertainty is a numerical representation of the uncertainty in the position of a tie point, based on the geometric relationship of the cameras from which that point was triangulated; it is the ratio between the largest and smallest semi-axes of the error ellipse created when triangulating 3D point coordinates between two images. Points constructed from image locations that are too close to each other have a low base-to-height ratio and high uncertainty. Removing these points does not affect the accuracy of optimization, but it reduces noise in the point cloud and prevents points with large uncertainty along the z-axis from influencing points with good geometry or being incorrectly removed in the reprojection error step. A reconstruction uncertainty of 10, which can be set as the threshold in the gradual selection filter, is roughly equivalent to a good base-to-height ratio of 1:2.3, whereas 15 is roughly equivalent to a marginally acceptable base-to-height ratio of 1:5.5. The reconstruction uncertainty selection procedure is repeated twice to drive the reconstruction uncertainty toward 10 without deleting more than 50 percent of the tie points each time.[1]

The second phase of error reduction removes points based on projection accuracy. Projection accuracy is essentially a representation of the precision with which a tie point can be known, given the size of the key points that intersect to create it. The key point size is the standard deviation of the Gaussian blur at the scale at which the key point was found: the smaller the mean key point size, the smaller the standard deviation and the more precisely the key point is located in the image. The highest-accuracy points are assigned to level 1 and are weighted based on the relative size of the pixels. A tie point assigned to level 2 has twice as much projection inaccuracy as one at level 1. The projection accuracy selection procedure is repeated, without deleting more than 50% of the points each time, until level 2 is reached and few points remain selected.[1]

The final phase of error reduction removes points based on reprojection error. This is a measure of the discrepancy between a 3D point's original location in an image and its location when projected back into each image used to estimate its position. Error values are normalized based on key point size. A high reprojection error usually indicates poor localization accuracy of the corresponding point projections at the point-matching step. Reprojection error can be reduced by iteratively selecting and deleting points, then optimizing, until the unweighted RMS reprojection error lies between 0.13 and 0.18, which means there is 95% confidence that the remaining points meet the estimated error.[1]

Plane Generation

Figure 4: Plane Generation

To calculate the height of a building, we first have to find the ground plane of the Venice PointCloud model, which can be retrieved with a tool provided by Open3D. Open3D supports segmentation of geometric primitives from point clouds using RANSAC ("Random Sample Consensus"), an iterative method for estimating the parameters of a mathematical model from a set of observed data containing outliers, when the outliers should have no influence on the estimated values. To find the ground plane with the largest support in the point cloud, we use segment_plane with a specified distance_threshold. The support of a hypothesised plane is the set of all 3D points whose distance from that plane is at most the threshold (e.g. 10 cm). After each run of the RANSAC algorithm, the plane with the largest support is chosen. All points supporting that plane can then be used to refine the plane equation by performing PCA or linear regression on the support set, which is more robust than using only the neighbouring points.
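Open3D's `segment_plane` implements this support-maximising loop internally. A minimal NumPy re-implementation (the function name `ransac_plane` is ours, and the refinement step is omitted) shows the idea:

```python
import numpy as np

def ransac_plane(points, distance_threshold=0.1, num_iterations=1000, rng=None):
    """Minimal RANSAC plane fit: returns (a, b, c, d) with a*x + b*y + c*z + d = 0
    (unit normal) and the indices of the supporting inliers."""
    rng = np.random.default_rng(rng)
    best_inliers, best_plane = np.array([], dtype=int), None
    for _ in range(num_iterations):
        # Hypothesise a plane from three random points.
        sample = points[rng.choice(len(points), 3, replace=False)]
        normal = np.cross(sample[1] - sample[0], sample[2] - sample[0])
        norm = np.linalg.norm(normal)
        if norm < 1e-12:                  # degenerate (collinear) sample
            continue
        normal /= norm
        d = -normal @ sample[0]
        # Support = points within the distance threshold of the plane.
        dist = np.abs(points @ normal + d)
        inliers = np.flatnonzero(dist <= distance_threshold)
        if len(inliers) > len(best_inliers):
            best_inliers, best_plane = inliers, (*normal, d)
    return best_plane, best_inliers
```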

Plane Redressing

After selecting the most suitable plane and its supporting points, we obtain the plane equation in 3D space. We want a PointCloud model in which the Z coordinate of a point equals the height of that point in the real world, so the next step is to redress the plane by mapping it onto the X-O-Y plane, using the transformation matrices given below.

Here, the plane equation is ax + by + cz + d = 0. The redressing is performed in two steps: first a translation, then a rotation.

For the translation: the plane intersects the Z-axis at (0, 0, -d/c), so the translation is

C.png

after which the plane passes through the origin, with normal vector v = (a, b, c)T.

For the rotation, the angle between v and k = (0, 0, 1)T is given by:

D.png


The axis of rotation has to be orthogonal to both v and k, so its versor (unit vector) is:

E.png

The rotation is represented by the matrix:

Planeredress.png

Please note that:

U1u2.png
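The two steps above can be combined in a short NumPy routine. This is a sketch of our redressing (the helper name `redress` is illustrative), using Rodrigues' rotation formula to build the rotation matrix from the axis u and the angle between v and k:

```python
import numpy as np

def redress(points, plane):
    """Translate so the plane passes through the origin, then rotate its
    normal v = (a, b, c) onto k = (0, 0, 1), so z becomes height above the plane."""
    a, b, c, d = plane
    pts = points + np.array([0.0, 0.0, d / c])   # the plane meets the z-axis at (0, 0, -d/c)
    v = np.array([a, b, c]) / np.linalg.norm([a, b, c])
    k = np.array([0.0, 0.0, 1.0])
    axis = np.cross(v, k)
    s, cth = np.linalg.norm(axis), v @ k         # sin and cos of the rotation angle
    if s < 1e-12:                                # normal already aligned with z
        return pts
    u = axis / s
    K = np.array([[0, -u[2], u[1]], [u[2], 0, -u[0]], [-u[1], u[0], 0]])
    R = np.eye(3) + s * K + (1 - cth) * (K @ K)  # Rodrigues' rotation formula
    return pts @ R.T
```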

Height Detection

Point-to-Plane Based Height Calculation and City Elevation Model Construction

Once the plane equation has been obtained in the previous step, we can start calculating building heights. In the first step, we compute the relative height of the buildings using the point-to-plane distance formula. The coordinates of each point can be accessed through the 'points' attribute of the Open3D 'PointCloud' object. Since we already have the plane equation, we can compute the distance between each point and the plane and save the results in an (n, 1) array.
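This step can be sketched as follows (the helper name is ours; in practice the (n, 3) array would come from `np.asarray(pcd.points)`):

```python
import numpy as np

def point_plane_distances(points, plane):
    """Distance of each point in an (n, 3) array to the plane
    a*x + b*y + c*z + d = 0, returned as an (n, 1) array."""
    a, b, c, d = plane
    dist = np.abs(points @ np.array([a, b, c]) + d) / np.sqrt(a * a + b * b + c * c)
    return dist.reshape(-1, 1)
```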

To visualise building heights, the first method is to change the colors of the points in the PointCloud model. Each 'PointCloud' has a 'colors' attribute; calling it returns an (n, 3) array with values in the range [0, 1] containing the RGB color of each point. Therefore, by normalising the distances array to [0, 1] and replicating it across three dimensions, we obtain an (n, 3) height array in the same form as the colors array.

In essence, we transform the height information of each point into its color information, expressing height through color: higher points get lighter colors and vice versa. This model provides an intuitive sense of building heights that is more qualitative than quantitative.
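The normalise-and-replicate step looks like this (a sketch; the helper name is ours):

```python
import numpy as np

def heights_to_colors(distances):
    """Normalise an (n, 1) distance array to [0, 1] and replicate it to an
    (n, 3) grayscale colour array: higher points get lighter colours."""
    h = (distances - distances.min()) / (distances.max() - distances.min())
    return np.repeat(h.reshape(-1, 1), 3, axis=1)
```

The result can then be written back to the model, e.g. with `pcd.colors = o3d.utility.Vector3dVector(colors)` in Open3D.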

Figure 5: City Elevation Model

Coordinate Based Height Calculation and City Elevation Map Construction

To show building heights more quantitatively, we decided to construct a City Elevation Map by plotting the height information of the points on a 2D map. There is one prerequisite: we need the actual height of every point. After "Plane Redressing", the z-coordinate can be taken as an approximation of height, but before using it directly we still need to scale the model so that building heights in the model match the actual building heights. If we know the actual height of a reference point in the real world and its height in the PointCloud model, we can derive the scale factor for the model. For every other building, the actual height then equals the height in the PointCloud model multiplied by the scale factor.

To put this into practice, we need a reference point in the model, and the "Campanile di San Marco" is the best choice: it is the tallest building in Venice, with a height of 98.6 metres. Using the built-in "scale" function from Open3D, we finally obtain a 1:1 Venice PointCloud model in which the z-coordinate of a point equals the actual height of that point in the real world.
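The scaling logic reduces to one division (a sketch; the helper name is ours, and in Open3D the equivalent is `pcd.scale(factor, center=(0, 0, 0))` so the ground plane stays at z = 0):

```python
import numpy as np

REFERENCE_HEIGHT_M = 98.6   # Campanile di San Marco, the tallest building in Venice

def scale_to_metres(points, reference_height_model):
    """Scale the redressed cloud so z-coordinates become metres in the real world."""
    factor = REFERENCE_HEIGHT_M / reference_height_model
    return points * factor, factor
```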

Figure 6: Venice
Figure 7: San Marco
Figure 8: San Giorgio Maggiore
Figure 9: Santa Maria della Salute

Cadaster Alignment

Figure 10: Alignment

In order to map our City Elevation Map onto the cadaster map, we use QGIS for georeferencing[1], the process of assigning locations to geographical objects within a geographic frame of reference. QGIS lets us mark ground control points manually in both maps: first we select a representative location in the City Elevation Map, then the same location in the reference cadaster map. We set around ten pairs of ground control points to align the two maps, choosing Venice zone EPSG 3004 as the coordinate reference system in the transformation settings. The blue layer is the cadaster map, while the red layer is our City Elevation Map. The georeferencing result is acceptable: our height map matches the cadaster quite accurately and without noticeable distortion.

Quality Assessment

For each building, the average building height was obtained using the interactive point selection method of Open3D. To evaluate the precision of our models, we compared the average measured height with the actual building height.

Error rate.PNG

Then we take the absolute value of each building's error rate and compute the mean error.

For each model,

Mean.PNG

where n is the total number of buildings in the model
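The two formulas above amount to the following (a minimal sketch; we assume the reference height is the denominator of the error rate):

```python
import numpy as np

def mean_error_rate(measured, reference):
    """Mean of the absolute per-building error rates |measured - reference| / reference."""
    measured, reference = np.asarray(measured, float), np.asarray(reference, float)
    return np.mean(np.abs(measured - reference) / reference)
```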

Objective Assessment of Google-Earth-Based Venice Model

The mean error of the building height measurements in the Google-Earth-based Venice model is 3.3%, computed over the landmarks and other buildings listed below.

Figure 11: The Whole Venice

Assessment of Landmarks

In order to obtain the heights of the specific buildings below from the model, we use the function 'pick_points(pcd)' in Open3D to select the vertex of each building. We then read the coordinates of the vertex; the z-coordinate represents the corresponding building's height.

Name of Building                           Average Measured Height   Reference Height   Error
Basilica di San Marco                      44m                       43m                2.32%
Campanile di San Marco                     99m                       98.6m              0.40%
Chiesa di San Giorgio Maggiore             42m                       NA                 NA
Campanile di San Giorgio Maggiore          72m                       63m                12.5%
Basilica di Santa Maria della Salute       64m                       NA                 NA
Campanile di Santa Maria della Salute      49m                       NA                 NA
Basilica dei Santi Giovanni e Paolo        54m                       55.4m              2.52%
Campanile di Chiesa di San Geremia         50m                       50m                0
Campanile dei Gesuiti                      43m                       40m                7.50%
Campanile San Francesco della Vigna        76m                       69m                10.14%
Chiesa del Santissimo Redentore            47m                       45m                4.44%
Hilton Molino Stucky Venice                45m                       NA                 NA

Assessment of other Buildings

Figure 14: Buildings near Basilica dei Santi Giovanni e Paolo
Figure 15: Buildings near Basilica dei Frari
Buildings near Basilica dei Santi Giovanni e Paolo
Index   Average Measured Height   Reference Height   Error
1       28.25m                    28.1m              0.53%
2       19.5m                     19.44m             0.30%
3       18.5m                     18.77m             1.43%
4       23m                       22.59m             1.81%
5       17m                       16.95m             0.29%
6       31m                       30.83m             0.55%
7       20.5m                     21.02m             2.47%
8       17m                       17.24m             1.39%
9       27.5m                     26.31m             4.52%
10      19.5m                     19.41m             0.46%

Buildings near Basilica dei Frari
Index   Average Measured Height   Reference Height   Error
11      63m                       63.07m             0.11%
12      31m                       30.94m             0.19%
13      20m                       20.8m              3.84%
14      20m                       21.28m             6.01%
15      26m                       28.22m             7.86%
16      24m                       24.92m             3.69%
17      26m                       26.4m              1.51%
18      46m                       46.58m             1.24%
19      15m                       15.87m             5.48%
20      10m                       10.56m             5.30%

Objective Assessment of YouTube-Video-Based Venice Model

Figure 14: Buildings near Basilica Cattedrale Patriarcale di San Marco
Figure 15: Buildings near Basilica di Santa Maria della Salute
Buildings near Basilica Cattedrale Patriarcale di San Marco
Index   Average Measured Height   Reference Height   Error
1       99m                       98.6m              0.40%
2       44m                       43m                2.32%
3       30m                       30.59m             1.92%
4       24.33m                    23.21m             4.82%
5       26m                       26.35m             1.32%
6       21m                       22.21m             5.44%
7       21m                       21.53m             2.46%

Buildings near Basilica di Santa Maria della Salute
Index   Average Measured Height   Reference Height   Error
8       21m                       21.76m             3.49%
9       19m                       20.12m             5.56%
10      11m                       11.15m             1.34%
11      22m                       23.21m             5.21%
12      11m                       12.07m             8.86%
13      14m                       14.18m             1.26%
14      16m                       16.98m             5.77%

Subjective Assessment of Google-Earth-Based Venice Model

For the subjective assessment, we gathered ten volunteers to give subjective scores on part of our City Elevation Map in order to evaluate its visual quality. We used the Double Stimulus Impairment Scale (DSIS), a type of double-stimulus subjective experiment in which pairs of images are shown side by side to a group of people for rating. One stimulus is always the reference image, while the other is the image to be evaluated. Participants rate the images on a five-level grading scale according to the impairment they perceive between the two images (5: Imperceptible, 4: Perceptible but not annoying, 3: Slightly annoying, 2: Annoying, 1: Very annoying). We compute the Differential Mean Opinion Score (DMOS) for the two map pairs below. The DMOS score can be computed as:

DMOS.png

where N is the number of participants and DVij is the differential score given by participant i for stimulus j. The score of the reference map is always set to 5. The DMOS scores of the images are:

DV.png
DMOS
Figure 16: Details of the City Elevation Map (First Image)
Figure 17: Details of the Given Cadaster (First Reference)
Participant First Image
1 5
2 4
3 4
4 4
5 5
6 4
7 5
8 5
9 4
10 4
Average 4.4
DMOS
Figure 18: Details of the City Elevation Map (Second Image)
Figure 19: Details of the Given Cadaster (Second Reference)
Participant Second Image
1 5
2 4
3 3
4 4
5 4
6 3
7 5
8 4
9 4
10 5
Average 4.1

According to the DMOS results, both images score highly, indicating acceptable fidelity in terms of subjective quality.

Display of YouTube-Video-Based Venice Models

Basilica Cattedrale Patriarcale di San Marco

Figure 20: Tie Point Model
Figure 21: Dense Cloud Model
Figure 22: Tiled Model
Figure 23: Mesh Model

Basilica di San Giorgio Maggiore

Figure 24: Tie Point Model
Figure 25: Dense Cloud Model
Figure 26: Tiled Model
Figure 27: Mesh Model

Basilica di Santa Maria della Salute

Figure 28: Tie Point Model
Figure 29: Dense Cloud Model
Figure 30: Tiled Model
Figure 31: Mesh Model

Limitations and Further Work

Limitations

  • Challenges in building models from Google Earth images

Using Google Earth images solves the problems mentioned above, but it is not a perfect method either. We faced a dilemma when collecting images from Google Earth. When we took screenshots from far above the ground, the image resolution was very low and the details of the buildings were poorly rendered, resulting in a low-quality PointCloud model. When we took screenshots from closer to the ground, we obtained much better results with far clearer details, but the number of images needed to build the PointCloud model grew enormously, and the computation became more than a personal computer could handle.

One possible solution is dividing Venice into smaller parts, building a PointCloud for every part, and merging the PointClouds models together in the end. However, this solution requires registration, which is not only quite imprecise, but also hard to implement.

  • Inaccurate Plane Generation and Redressing

One of the inherent defects in the methodology is that 'Plane Generation' and 'Plane Redressing' are imperfect. When we implement 'Plane Redressing', the ideal outcome is that after a single transformation the plane equation becomes 0x + 0y + z = 0. In practice this is not the case: the coefficients of x and y are not exactly 0. The error caused by this defect can be neutralised by applying the transformation iteratively, e.g. 1000 times: as the number of iterations increases, the coefficients of x and y approach zero.

Further Work

  • More Precise Georeferencing and Pixel Matching

We only chose ten ground control points for QGIS to georeference the raster height map. Generally, the more points you select, the more accurately the image is registered to the target coordinates of the cadaster. In the future, after choosing enough ground control points, we will compare our Venice elevation map with the cadaster pixel by pixel at the same resolution, with both maps using the same color and grey scales.

  • Building Contour Sharpening

In our Venice elevation map, the building footprints are blurred and poorly detected in some areas. In the future, we will use Shapely in Python to extract the building polygons and OpenCV to implement contour approximation.

Planning

Week Tasks Completion
week 5
  • Brainstorm and present initial ideas for the project
week 6-7
  • Configure the environment and master the related software and libraries
  • Find suitable drone videos on YouTube and extract photos for further photogrammetry work
week 8-9
  • Align photos, generate and denoise dense point clouds, build mesh and tiled models of Venice
  • Collect screenshots of the whole Venice on Google Earth
week 10
  • Generate the plane of the point cloud models
  • Prepare for the midterm presentation
week 11
  • Construct City Elevation Model
  • Redress the plane and generate City Elevation Map
week 12
  • Align our City Elevation Map to the given Venice cadaster
  • Assess the visual quality and building height accuracy of the Google-Earth-based Venice model, and the building height accuracy of the YouTube-based Venice model
week 13
  • Write final report, refine the map layers with building height information and the final models
week 14
  • Refine the final report, code and models
  • Prepare for the final presentation

References

[1] https://pubs.er.usgs.gov/publication/ofr20211039

Deliverables

  • Source Codes:

https://github.com/yetiiil/venice_building_height

  • Venice Point Cloud Models:

Google Earth Model

Basilica di Santa Maria della Salute Model

Basilica di San Giorgio Maggiore Model

Basilica Cattedrale Patriarcale di San Marco Model

In case the files expire, please contact yuxiao.li@epfl.ch