我正试图找到最有效的方法来解决这个问题,但我必须告诉你,我已经搞砸了它.环顾四周,没有发现任何相关性,所以在这里.
如何选择与所需项目具有相似标签的所有项目?
- project 1 -> tagA | tagB | tagC
- project 2 -> tagA | tagB
- project 3 -> tagA
- project 4 -> tagC
选择项目1应返回所有项目.
选择项目4应仅返回项目项目1
到目前为止,我的查询非常依赖于左连接,并且肯定有更好的方法来执行此操作:
- SELECT all_tags.project_id,all_tags.tag_id,final.title,tag.tag
- FROM projects AS p
- LEFT JOIN projects_to_tags AS pt ON p.num = pt.project_id
- LEFT JOIN projects_to_tags AS all_tags ON pt.tag_id = all_tags.tag_id
- LEFT JOIN projects AS final ON all_tags.project_id = final.num
- LEFT JOIN tags AS tag ON all_tags.tag_id = tag.tag_id
- WHERE p.num = 4
- GROUP BY final.num
谢谢大家的意见.我虽然与大家分享了100k项目数据库上所有查询的平均结果,100k标签数据库与100k projects_to_tags关系.所有查询都已更改为要求project_1.
甜蜜和短暂:
- 0.0160 sec - OMG Ponies - Using JOINS
- 0.0208 sec - jdelard
- 0.2581 sec - OMG Ponies - Using EXISTS
- 0.2777 sec - OMG Ponies - Using IN
- 0.5295 sec - Emtucifor - updated query
- 0.5088 sec - Emtucifor - first query
非常感谢大家.将相应地更新我的所有查询.
- ===============================================================================================================================================
- Emtucifor - updated query
- ===============================================================================================================================================
- Showing rows 0 - 1 (2 total,Query took 0.5295 sec)
- SELECT *
- FROM projects AS L
- WHERE L.num !=1-- instead of <> PT2.project_id inside
- AND EXISTS (
- SELECT 1
- FROM projects_to_tags PT
- INNER JOIN projects_to_tags PT2 ON PT.tag_id = PT2.tag_id
- WHERE L.num = PT.project_id
- AND PT2.project_id =1
- )
- LIMIT 0,30
- id select_type table type possible_keys key key_len ref rows Extra
- 1 PRIMARY L ALL PRIMARY NULL NULL NULL 100000 Using where
- 2 DEPENDENT SUBQUERY PT2 ref project_id project_id 4 const 1 Using index
- 2 DEPENDENT SUBQUERY PT ref project_id project_id 8 test.L.num,test.PT2.tag_id 12000 Using index
- ===============================================================================================================================================
- Emtucifor - first query
- ===============================================================================================================================================
- Showing rows 0 - 1 (2 total,Query took 0.5088 sec)
- SELECT *
- FROM projects AS L
- WHERE
- EXISTS (
- SELECT 1
- FROM projects_to_tags PT
- INNER JOIN projects_to_tags PT2 ON PT.tag_id = PT2.tag_id
- WHERE L.num = PT.project_id
- AND PT2.project_id =1
- AND PT2.project_id <> L.num
- )
- LIMIT 0,30
- id select_type table type possible_keys key key_len ref rows Extra
- 1 PRIMARY L ALL NULL NULL NULL NULL 100000 Using where
- 2 DEPENDENT SUBQUERY PT2 ref project_id project_id 4 const 1 Using index
- 2 DEPENDENT SUBQUERY PT ref project_id project_id 8 test.L.num,test.PT2.tag_id 12000 Using where; Using index
- ===============================================================================================================================================
- jdelard
- ===============================================================================================================================================
- Showing rows 0 - 1 (2 total,Query took 0.0208 sec)
- SELECT p.num,p.title
- FROM projects_to_tags pt1,projects_to_tags pt2,projects p
- WHERE pt1.project_id =1
- AND pt2.project_id !=1
- AND pt1.tag_id = pt2.tag_id
- AND p.num = pt2.project_id
- GROUP BY pt2.project_id
- LIMIT 0,30
- id select_type table type possible_keys key key_len ref rows Extra
- 1 SIMPLE pt1 ref project_id project_id 4 const 1 Using index; Using temporary; Using filesort
- 1 SIMPLE pt2 index project_id project_id 8 NULL 75001 Using where; Using index
- 1 SIMPLE p eq_ref PRIMARY PRIMARY 4 test.pt2.project_id 1
- ===============================================================================================================================================
- OMG Ponies - Using IN
- ===============================================================================================================================================
- Showing rows 0 - 2 (3 total,Query took 0.2777 sec)
- SELECT p . *
- FROM projects p
- JOIN projects_to_tags pt ON pt.project_id = p.num
- WHERE pt.tag_id
- IN (
- SELECT x.tag_id
- FROM projects_to_tags x
- WHERE x.project_id =1
- )
- LIMIT 0,30
- id select_type table type possible_keys key key_len ref rows Extra
- 1 PRIMARY pt index project_id project_id 8 NULL 100001 Using where; Using index
- 1 PRIMARY p eq_ref PRIMARY PRIMARY 4 test.pt.project_id 1
- 2 DEPENDENT SUBQUERY x ref project_id project_id 8 const,func 12000 Using where; Using index
- ===============================================================================================================================================
- OMG Ponies - Using EXISTS
- ===============================================================================================================================================
- Showing rows 0 - 2 (3 total,Query took 0.2581 sec)
- SELECT p . *
- FROM projects p
- JOIN projects_to_tags pt ON pt.project_id = p.num
- WHERE EXISTS (
- SELECT NULL
- FROM projects_to_tags x
- WHERE x.project_id = 1
- AND x.tag_id = pt.tag_id
- )
- LIMIT 0,30
- ===============================================================================================================================================
- OMG Ponies - Using JOINS
- ===============================================================================================================================================
- Showing rows 0 - 2 (3 total,Query took 0.0160 sec)
- SELECT DISTINCT p . *
- FROM projects p
- JOIN projects_to_tags pt ON pt.project_id = p.num
- JOIN projects_to_tags x ON x.tag_id = pt.tag_id
- AND x.project_id = 1
- LIMIT 0,30
- id select_type table type possible_keys key key_len ref rows Extra
- 1 SIMPLE x ref project_id project_id 4 const 1 Using index; Using temporary
- 1 SIMPLE pt index project_id project_id 8 NULL 75001 Using where; Using index
- 1 SIMPLE p eq_ref PRIMARY PRIMARY 4 test.pt.project_id 1
- CREATE TABLE IF NOT EXISTS `projects` (
- `num` int(2) NOT NULL auto_increment,`title` varchar(30) NOT NULL,PRIMARY KEY (`num`)
- ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=5 ;
- INSERT INTO `projects` (`num`,`title`) VALUES(1,'project 1'),(2,'project 2'),(3,'project 3'),(4,'project 4');
- CREATE TABLE IF NOT EXISTS `projects_to_tags` (
- `project_id` int(2) NOT NULL,`tag_id` int(2) NOT NULL,KEY `project_id` (`project_id`,`tag_id`)
- ) ENGINE=MyISAM DEFAULT CHARSET=latin1;
- INSERT INTO `projects_to_tags` (`project_id`,`tag_id`) VALUES(1,1),(1,2),3),3);
- CREATE TABLE IF NOT EXISTS `tags` (
- `tag_id` int(2) NOT NULL auto_increment,`tag` varchar(30) NOT NULL,PRIMARY KEY (`tag_id`),UNIQUE KEY `tag` (`tag`)
- ) ENGINE=MyISAM DEFAULT CHARSET=latin1 AUTO_INCREMENT=4 ;
- INSERT INTO `tags` (`tag_id`,`tag`) VALUES(1,'tag a'),'tag b'),'tag c');
解决方法
在以下任何一种情况下,如果您不知道PROJECT.num / PROJECT_TO_TAGS.project_id,则必须加入PROJECTS表以获取id值以找出它关联的标记.
使用IN
- SELECT p.*
- FROM PROJECTS p
- JOIN PROJECTS_TO_TAGS pt ON pt.project_id = p.num
- WHERE pt.tag_id IN (SELECT x.tag_id
- FROM PROJECTS_TO_TAGS x
- WHERE x.project_id = 4)
使用EXISTS
- SELECT p.*
- FROM PROJECTS p
- JOIN PROJECTS_TO_TAGS pt ON pt.project_id = p.num
- WHERE EXISTS (SELECT NULL
- FROM PROJECTS_TO_TAGS x
- WHERE x.project_id = 4
- AND x.tag_id = pt.tag_id)
使用JOINS(这是最有效的!)
DISTINCT是必要的,因为JOIN会冒结果集中出现重复数据的风险……
- SELECT DISTINCT p.*
- FROM PROJECTS p
- JOIN PROJECTS_TO_TAGS pt ON pt.project_id = p.num
- JOIN PROJECTS_TO_TAGS x ON x.tag_id = pt.tag_id
- AND x.project_id = 4