LUMINOUS: Difference between revisions

Latest revision as of 13:09, 22 April 2026

LUMINOUS Project

CORDIS Reference	Start date	End date	Coordinator
https://cordis.europa.eu/project/id/101135724	01/01/2024	31/12/2026	DFKI / Germany

Project description

LUMINOUS aims to create the next generation of Language Augmented XR systems, in which natural language-based communication and Multimodal Large Language Models (MLLMs) enable adaptation to individual, not predefined, user needs and to unseen environments. This will enable future XR users to interact fluently with their environment while having instant access to constantly updated global and domain-specific knowledge sources to accomplish novel tasks. We aim to exploit MLLMs injected with domain-specific knowledge to describe novel tasks on user demand. These descriptions are then communicated through a speech interface and/or a task-adaptable avatar (e.g. coach/teacher) in the form of visual aids and procedural steps for accomplishing the task. Language-driven specification of the style, facial expressions, and specific attitudes of virtual avatars will facilitate generalisable and situation-aware communication across multiple use cases and sectors. In parallel, LLMs will benefit by identifying new objects that were not part of their training data and then describing them in a way that makes them visually recognisable. Our results will be prototyped and tested in three pilots, focussing on neurorehabilitation (support of stroke patients with language impairments), immersive industrial safety training, and 3D architectural design review. A consortium of six leading R&D institutes, with expertise in six different disciplines (AI, Augmented Vision, NLP, Computer Graphics, Neurorehabilitation, Ethics), will follow a challenging workplan, aiming to bring about a new era at the crossroads of two of the most promising current technological developments (LLM/AI and XR), made in Europe.

Project outputs

Publications

Domain	Type of output	Title	DOI URL
AI, Machine Learning & Data Science	Peer reviewed articles	Next Generation XR Systems—Large Language Models Meet Augmented and Virtual Reality	https://doi.org/10.1109/MCG.2025.3548554
Computer Vision, 3D Modeling & Rendering	Conference proceedings	Vision-Language Models Struggle to Align Entities across Modalities	https://doi.org/10.18653/V1/2025.FINDINGS-ACL.965
Computer Vision, 3D Modeling & Rendering	Conference proceedings	Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection	https://doi.org/10.1109/CVPR52733.2024.00558
Computer Vision, 3D Modeling & Rendering	Conference proceedings	PixT3: Pixel-based Table-To-Text Generation	https://doi.org/10.18653/V1/2024.ACL-LONG.364
Computer Vision, 3D Modeling & Rendering	Conference proceedings	MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation	https://doi.org/10.1109/CVPR52734.2025.00759
Computer Vision, 3D Modeling & Rendering	Conference proceedings	Compact 3D Scene Representation via Self-Organizing Gaussian Grids	https://doi.org/10.1007/978-3-031-73013-9_2
Computer Vision, 3D Modeling & Rendering	Conference proceedings	Realtime-Rendering of Dynamic Scenes with Neural Radiance Fields	https://doi.org/10.1109/VRW66409.2025.00345
Computer Vision, 3D Modeling & Rendering	Conference proceedings	Improving Adaptive Density Control for 3D Gaussian Splatting	https://doi.org/10.5220/0013308500003912
Computer Vision, 3D Modeling & Rendering	Conference proceedings	Gaussian Splatting Decoder for 3D-aware Generative Adversarial Networks	https://doi.org/10.1109/CVPRW63382.2024.00794
Computer Vision, 3D Modeling & Rendering	Conference proceedings	Multi-Resolution Generative Modeling of Human Motion from Limited Data	https://doi.org/10.1145/3697294.3697309
Computer Vision, 3D Modeling & Rendering	Peer reviewed articles	3DGS.zip: A survey on 3D Gaussian Splatting Compression Methods	https://doi.org/10.1111/CGF.70078

Technological assets

Title	Type of Asset	Link / DOI	Description
Text2CAD	AI Model	https://luminous-horizon.eu/index.php/blogs/introducing-text2cad-revolutionizing-cad-generation-from-text-prompts-for-next-gen-xr-in-luminous/	A generative model capable of producing sequential CAD designs from text prompts.
MARVEL-40M+	AI Model / Framework	https://doi.org/10.1109/CVPR52734.2025.00759	A multi-level visual elaboration framework designed for high-fidelity text-to-3D content creation.
