DSF-Net: semantic segmentation of large-scale point clouds based on integrating deep and shallow networks

2025

Gang Xiao, Yangsheng Zhong, Zhipeng Wang, Sihan Ge, Qibing Wang, Feng Xu and Jiawei Lu

Figure 1

DSF-Net network model. (Source: Authors’ own creation)

Figure 2

Point-atrous spatial pyramid pooling. (Source: Authors’ own creation)

Figure 3

Deep and shallow relation-aware feature fusion (RAF). (Source: Authors’ own creation)

Figure 4

SensatUrban dataset visualization results. (Source: Authors’ own creation)

Table 1

Performance comparison experiments on S3DIS

Methods	mIoU (%)	Total time (s)	GPU memory (MB)
PointNet++ (Liu et al., 2019)	57.75	45,466	4,392
SPG (Zhao et al., 2022)	60.3	56,433	1,093
PointCNN (Que et al., 2021)	56.45	86,544	10,932
RandLA-Net (Hu et al., 2020)	51.57	19,326	1,563
DSF-Net (ours)	62.1	21,854	1,242

Note(s): SPG: super point graph

Source(s): Authors’ own creation

Table 2

SensatUrban dataset experiment

Method	OA (%)	mIoU (%)	mAcc (%)	Per-class IoU (%)
				Ground	Vegetation	Building	Wall	Bridge	Parking	Rail	Traffic	Street	Car	Footpath	Bike	Water
PointNet	80.78	22.75	23.71	67.96	89.52	80.05	0	0	3.95	0	31.55	0	35.14	0	0	0
PointNet++	84.3	35.06	32.92	72.46	94.24	84.77	2.82	2.09	25.79	0	31.54	11.42	38.84	7.12	0	56.93
SegCloud	85.27	37.29	37.29	69.93	94.55	88.87	32.83	12.58	15.77	15.48	30.63	22.96	56.42	0.54	0	44.24
SPG	88.66	40.93	42.66	74.10	97.9	94.2	63.3	7.5	24.2	0	30.10	34	74.4	0	0	54.8
RandLA-Net	90.2	56.43	57.58	87.1	98.91	95.33	74.4	28.69	41.38	0	55.99	54.43	85.67	50.39	0	71.30
DSF-Net (ours)	89.78	56.9	58.69	89.11	98.07	96.58	88.40	50.45	61.62	0	66.67	53.23	86.14	39.63	0	71.31

Note(s): SPG: super point graph

Source(s): Authors’ own creation
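
The OA, mAcc and mIoU columns above follow the usual segmentation definitions. The sketch below shows how such scores are commonly derived from a confusion matrix. It is a minimal illustration, not the authors' evaluation code: the function name `segmentation_metrics`, the toy matrix and the choice to score a class with no points as 0 are all assumptions made here.

```python
def segmentation_metrics(confusion):
    """Compute OA, mAcc, mIoU and per-class IoU (all in %) from a confusion matrix.

    confusion[t][p] is the number of points with true class t predicted as class p.
    Assumption: a class with no ground-truth points scores 0 accuracy, and a class
    with an empty union scores 0 IoU. Benchmarks may handle such classes differently.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[c][c] for c in range(n))

    ious, accs = [], []
    for c in range(n):
        tp = confusion[c][c]
        gt = sum(confusion[c])                         # points whose true label is c
        pred = sum(confusion[r][c] for r in range(n))  # points predicted as c
        union = gt + pred - tp
        ious.append(tp / union if union else 0.0)
        accs.append(tp / gt if gt else 0.0)

    return {
        "OA": 100.0 * correct / total if total else 0.0,
        "mAcc": 100.0 * sum(accs) / n,
        "mIoU": 100.0 * sum(ious) / n,
        "IoU": [100.0 * v for v in ious],
    }


# Toy three-class example (hypothetical counts); the third class never occurs.
toy_confusion = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 0],
]
metrics = segmentation_metrics(toy_confusion)
print(metrics)
```

Because the mean is taken over all classes, a class that is never predicted correctly (IoU of 0) pulls mIoU down even when OA stays high, which is why the two columns can rank methods differently.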

Table 3

DSF-Net ablation study

D	S	Full scale	1/4 scale	mIoU (%)	Time (s)
√			√	32.05	444,300
√		√		66.26	543,468
	√		√	46.15	112,354
	√	√		56.15	436,542
√	√		√	59.05	239,080
√	√	√		65.15	346,752

Source(s): Authors’ own creation

Table 4

RAF ablation study

ADD	RAF	mIoU (%)	Total time (s)
√		45.26	45,832
	√	62.15	23,548

Source(s): Authors’ own creation

References

Armeni, Sax, Zamir, A.R. and Savarese (2017), “Joint 2d-3d-semantic data for indoor scene understanding”, arXiv preprint arXiv:1702.01105.

Chen, Xu, Geiger, Yu and Su (2022), “Tensorf: tensorial radiance fields”, European Conference on Computer Vision, Springer, pp. 333-350.

Croitoru, F.-A., Hondru, Ionescu, R.T. and Shah (2023), “Diffusion models in vision: a survey”, IEEE Transactions on Pattern Analysis and Machine Intelligence, pp. 10850-10869, doi: https://doi.org/10.1109/tpami.2023.3261988

Gao, Liu, Fang, Jiang and Huq, K.M.S. (2022), “LFT-Net: local feature transformer network for point clouds analysis”, IEEE Transactions on Intelligent Transportation Systems, pp. 2158-2168, doi: https://doi.org/10.1109/tits.2022.3140355

Geng and Zhao (2021), “Multi-scale attentive aggregation for LiDAR point cloud segmentation”, Remote Sensing, p. 691, doi: https://doi.org/10.3390/rs13040691

Guo, Wang, Bell and Greer (2003), “KNN model-based approach in classification”, On The Move to Meaningful Internet Systems 2003: CoopIS, DOA, and ODBASE: OTM Confederated International Conferences, CoopIS, DOA, and ODBASE 2003, Catania, Sicily, Italy, November 3-7, 2003, Proceedings, Springer, pp. 986-996.

Hamdi, Giancola and Ghanem (2021), “Voint cloud: multi-view point cloud representation for 3d understanding”, arXiv preprint arXiv:2111.15363.

Hu, Yang, Xie, Rosa, Guo, Wang, Trigoni and Markham (2020), “Randla-net: efficient semantic segmentation of large-scale point clouds”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 11108-11117.

Hu, Yang, Khalid, Xiao, Trigoni and Markham (2022), “Sensaturban: learning semantics from urban-scale photogrammetric point clouds”, International Journal of Computer Vision, Vol. 130, pp. 316-343, doi: https://doi.org/10.1007/s11263-021-01554-9

Jhaldiyal and Chaudhary (2023), “Semantic segmentation of 3D LiDAR data using deep learning: a review of projection-based methods”, Applied Intelligence, pp. 6844-6855, doi: https://doi.org/10.1007/s10489-022-03930-5

Li, Bu, Sun, Wu, Di and Chen (2018), “Pointcnn: convolution on x-transformed points”, in Advances in Neural Information Processing Systems.

Liu, Fan, Meng, Lu, Xiang and Pan (2019), “Densepoint: learning densely contextual representation for efficient point cloud processing”, Proceedings of the IEEE/CVF International Conference on Computer Vision, pp. 5239-5248.

Lyu, Huang and Zhang (2020), “Learning to segment 3d point clouds in 2d image space”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 12255-12264.

Qi, C.R., Su, Mo and Guibas, L.J. (2017a), “Pointnet: deep learning on point sets for 3d classification and segmentation”, Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 652-660.

Qi, C.R., Yi, Su and Guibas, L.J. (2017b), “Pointnet++: deep hierarchical feature learning on point sets in a metric space”, in Advances in Neural Information Processing Systems.

Que, Lu and Xu (2021), “Voxelcontext-net: an octree based framework for point cloud compression”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6042-6051.

Shuai and Liu (2021), “Backward attentive fusing network with local aggregation classifier for 3D point cloud semantic segmentation”, IEEE Transactions on Image Processing, pp. 4973-4984, doi: https://doi.org/10.1109/tip.2021.3073660

Liu, Yuan, Cheng, Zhang, Shen and Wang (2022), “DLA-Net: learning dual local attention features for semantic segmentation of large-scale building facade point clouds”, Pattern Recognition, Vol. 123, 108372, doi: https://doi.org/10.1016/j.patcog.2021.108372

Guang, Luo and Zhao (2023), “Multiview fusion driven 3-D point cloud semantic segmentation based on hierarchical transformer”, IEEE Sensors Journal, pp. 31461-31470, doi: https://doi.org/10.1109/jsen.2023.3328603

Yu, Tang, Rao, Huang, Zhou and Lu (2022), “Point-bert: pre-training 3d point cloud transformers with masked point modeling”, Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 19313-19322.

Zeng, Xie, Tang, Wan et al. (2024), “Large-scale point cloud semantic segmentation via local perception and global descriptor vector”, Expert Systems with Applications, Vol. 246, 123269, doi: https://doi.org/10.1016/j.eswa.2024.123269

Zhang, Wang, Chen, Zhang and Zhang (2024), “Point and voxel cross perception with lightweight CosFormer for large-scale point cloud semantic segmentation”, International Journal of Applied Earth Observation and Geoinformation, Vol. 131, 103951, doi: https://doi.org/10.1016/j.jag.2024.103951

Zhao, Liu, Ming and Tao (2022), “SVASeg: sparse voxel-based attention for 3D LiDAR point cloud semantic segmentation”, Remote Sensing, 4471, doi: https://doi.org/10.3390/rs14184471