Unstructured data is information that is not organized according to a predefined model that computers can readily interpret. In the past, most computer systems required data to be highly structured before it could be processed, typically as tables of rows and fields with specific data types. However, real-world information is often more complex and does not fit neatly into these structures. Modern information technologies such as artificial intelligence are therefore increasingly designed to process unstructured data, which is the more common form of information in the real world.
The following are examples of unstructured data and how it is processed.
- Writing: Textual analysis of written works such as books and blogs.
- Social Media: Scanning streams of social media to detect real-time information such as rumors about a stock.
- Natural Language: Systems that accept voice commands or understand what people are saying for purposes such as analytics.
- Photographs & Video: Analysis of images and video to understand events, such as using a camera feed to monitor water levels in a dam reservoir.
- Communications: Scanning communications such as emails to detect spam.
- Science: Looking for patterns in interstellar radio signals in order to detect signs of intelligent life.
- Health: Analysis of x-ray images for signs of disease.
- Search: A search engine that spiders unstructured web pages in order to understand their content.
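
The contrast between structured and unstructured data can be illustrated with a small sketch. The example below is only an assumption-laden illustration: the `Order` record, the sample `EMAIL` text, and the `extract_order` helper are all hypothetical names invented here, and the regular expressions are a deliberately naive stand-in for the statistical or AI-based techniques that real systems would more likely use.

```python
import re
from dataclasses import dataclass
from typing import Optional


# Structured data: every field has a fixed name and data type,
# so a program can read values directly.
@dataclass
class Order:
    order_id: int
    customer: str
    total: float


# Unstructured data: free text with no predefined fields (hypothetical sample).
EMAIL = "Hi, order 1042 arrived damaged. Please refund the $59.90 to Dana. Thanks!"


def extract_order(text: str) -> Optional[Order]:
    """Naively pull structured fields out of free text.

    Returns None when any of the expected pieces cannot be found.
    """
    order_match = re.search(r"order\s+(\d+)", text, re.IGNORECASE)
    total_match = re.search(r"\$(\d+(?:\.\d{2})?)", text)
    name_match = re.search(r"\bto\s+([A-Z][a-z]+)", text)
    if not (order_match and total_match and name_match):
        return None
    return Order(
        order_id=int(order_match.group(1)),
        customer=name_match.group(1),
        total=float(total_match.group(1)),
    )


if __name__ == "__main__":
    print(extract_order(EMAIL))
```

The structured record can be queried by field name, whereas the email must first be interpreted. Hand-written patterns like these tend to break as soon as the wording changes, which is one reason unstructured data is usually handled with more flexible techniques.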