---
license: mit
language:
- en
tags:
- schema
- word-embeddings
- embeddings
- unsupervised-learning
- tables
- web-table
- schema-data
---
# Pre-trained Web Table Embeddings

The models here represent schema terms and instance data terms in a semantic vector space. This makes them especially useful for representing schema and class information, as well as for ML tasks on tabular text data.

The code for executing and evaluating the models is located in the [table-embeddings GitHub repository](https://github.com/guenthermi/table-embeddings).

## Quick Start

To encode text from tables, install the `table_embeddings` package by running the following commands:

```bash
pip install cython
pip install git+https://github.com/guenthermi/table-embeddings.git
```

After that, you can encode text with the following Python snippet:

```python
from table_embeddings import TableEmbeddingModel

# Load the 64-dimensional W-combo model
model = TableEmbeddingModel.load_model('ddrg/web_table_embeddings_combo64')
# Look up the embedding for the header term 'headline'
embedding = model.get_header_vector('headline')
```

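
A common next step after encoding terms is to compare them. The sketch below computes the cosine similarity of two vectors using only the standard library. It is an illustration, not part of the `table_embeddings` package: the helper `cosine_similarity` and the stand-in vectors are hypothetical, and the `None` handling assumes that a lookup may return no vector for a term the model does not know.

```python
import math
from typing import Optional, Sequence


def cosine_similarity(a: Optional[Sequence[float]],
                      b: Optional[Sequence[float]]) -> Optional[float]:
    """Cosine similarity of two equal-length vectors, or None if undefined."""
    # Assumption: a lookup may yield None for a term the model does not know.
    if a is None or b is None:
        return None
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimensionality")
    dot = sum(float(x) * float(y) for x, y in zip(a, b))
    norm_a = math.sqrt(sum(float(x) ** 2 for x in a))
    norm_b = math.sqrt(sum(float(y) ** 2 for y in b))
    # The similarity is undefined if either vector has zero length.
    if norm_a == 0.0 or norm_b == 0.0:
        return None
    return dot / (norm_a * norm_b)


# Stand-in vectors; in practice these would come from calls such as
# model.get_header_vector('headline') and model.get_header_vector('title').
headline_vec = [0.2, 0.1, -0.4]
title_vec = [0.1, 0.3, -0.5]
similarity = cosine_similarity(headline_vec, title_vec)
```

Values close to 1 indicate that two terms point in a similar direction in the embedding space. Vectors from different models or dimensionalities should not be compared with each other.
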
## Model Types

| Model Type | Description | Download Links |
| ---------- | ----------- | -------------- |
| W-tax | Model of relations between table header and table body | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_tax64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_tax150)) |
| W-row | Model of row-wise relations in tables | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_row64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_row150)) |
| W-combo | Model of row-wise relations and relations between table header and table body | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_combo64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_combo150)) |
| W-plain | Model of row-wise relations in tables without pre-processing | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_plain64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_plain150)) |

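
The download links follow a regular naming pattern, `ddrg/web_table_embeddings_<type><dim>`. The following sketch builds such an identifier from a model type and a dimensionality. The helper `model_repo_id` and the constants are hypothetical names introduced for illustration; they are not part of the `table_embeddings` package, and the pattern is inferred only from the eight links listed in the table.

```python
MODEL_TYPES = ("tax", "row", "combo", "plain")
DIMENSIONS = (64, 150)


def model_repo_id(model_type: str, dim: int) -> str:
    """Build a repository ID such as 'ddrg/web_table_embeddings_combo64'."""
    # Accept both the 'W-combo' and the 'combo' spelling.
    key = model_type.lower()
    if key.startswith("w-"):
        key = key[2:]
    if key not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    if dim not in DIMENSIONS:
        raise ValueError(f"unsupported dimensionality: {dim}")
    return f"ddrg/web_table_embeddings_{key}{dim}"


repo_id = model_repo_id("W-combo", 64)
# All eight identifiers that appear in the table.
all_repo_ids = [model_repo_id(t, d) for t in MODEL_TYPES for d in DIMENSIONS]
```

The resulting string has the same form as the identifier passed to `TableEmbeddingModel.load_model` in the Quick Start snippet.
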
## More Information

For examples of how to use the models, see the [GitHub repository](https://github.com/guenthermi/table-embeddings).

More information can be found in the paper [Pre-Trained Web Table Embeddings for Table Discovery](https://dl.acm.org/doi/10.1145/3464509.3464892):

```
@inproceedings{gunther2021pre,
  title={Pre-Trained Web Table Embeddings for Table Discovery},
  author={G{\"u}nther, Michael and Thiele, Maik and Gonsior, Julius and Lehner, Wolfgang},
  booktitle={Fourth Workshop in Exploiting AI Techniques for Data Management},
  pages={24--31},
  year={2021}
}
```