
[SPIKE] Result thumbnails questions
Closed, Resolved, Public

Assigned To
None
Authored By
Seddon
Apr 27 2022, 7:19 PM
Referenced Files
F35155882: Special_Search.png
May 19 2022, 7:04 PM
F35155822: Special_Search.png
May 19 2022, 7:02 PM
F35155825: Special_Search.png
May 19 2022, 7:02 PM
F35152066: Thumbnails.png
May 18 2022, 7:42 PM
F35152051: Thumbnails.png
May 18 2022, 7:41 PM

Description

Questions
  1. (design) What is the thumbnail on-click and on-hover behaviour?
  2. (design/technical) How can we present the image in the thumbnail?
    • Not all images are square.
    • Will the image be fixed width?
    • How will the image adjust for snippet length?
    • Fixed height?
    • Min-height?
    • Will the image be presented centered?
  3. (technical) - Is the thumbnail data already available in search results or do we need to pull it in via the api?
  4. (technical) - How many articles have images? And are we fine if the majority don't have any?
  5. (technical/design) - What do we want to do for non-content namespaces?
  6. (technical) - What is the optimal resolution for likely-warm-in-cache to maximize image delivery? (note: check MediaInfo, optimal sizes are somewhere in there already)

Note: We're using the same thumbnails that are used in the Go bar autocomplete

Event Timeline

Seddon updated the task description.

Luckily, figuring out what would be a good thumbnail for a page is a "solved problem" in the form of Extension:PageImages.
It's enabled on most wikis, with the exception of wikibooks, wikisource, wikitech and the locked-down wikis (loginwiki, votewiki). It also (usually) only works with the main namespace.

Findings:

  • These are already used in the MW REST API (T250144) & New Vector's fancier search, and are used as the article's Open Graph images
  • This is not built-in (into MediaWiki core) behavior. It requires an extension (not present on all wikis), so we're going to need the ability to turn this off completely.
  • Usually, only articles in the main namespace have a known thumbnail. We'll need to consider (largely) thumbnail-less searches when users get lots of content from other namespaces.
  • Even within the main namespace, the majority of articles have no thumbnail. I'm willing to bet that there's a good correlation between <prevalence of articles in search results> and <likelihood of having an image>, because both tend to increase with more article content. I expect the best matches to be the most likely to have thumbnails. Still, lesser matches (or more obscure searches) will likely yield many thumbnail-less results.

Pinging @Sneha for FYI.

TL;DR: I expect the majority of searches to have pretty decent thumbnail coverage for the better matches. But there are still going to be many cases (far too many to ignore) where few or no results will have thumbnails.

I have run some numbers (see below) on the thumbnail availability on a wide random selection of projects/languages/namespaces if you're interested in more details:


enwiki

namespace | pages total | pages with thumb
0166008962813768
188393520
238227430
3178199460
413467760
51822510
69299090
76966060
822980
918780
107353400
114708130
1225940
1315490
14rMW2164406f241f0
1517899290
1001053530
101263840
1181869850
119629760
71012790
7111500
828148040
829101400
230010
230110

dewiki

namespace | pages total | pages with thumb
043991041127009
18042990
25582120
38049890
41023250
5197140
61308770
726710
828030
92720
10857790
1163430
1211270
135520
144763040
1599840
100207620
10164480
8288540
8294520

hewiki

namespace | pages total | pages with thumb
0534359220062
11564410
2646420
32264260
4238570
533710
6731370
720270
811100
94340
10440370
11161640
122520
13930
141156710
1567320
100429350
10112480
1082100
109190
11843650
1196140
8282860
829530
260028340

ptwiki

namespace | pages total | pages with thumb
01865300640470
15289020
21548350
321272590
41159980
558560
6582520
71150
877480
92080
10908380
1132000
127240
131250
143639120
1518890
100236320
1018700
1043350
1051370
447260
710370
82817740
829830
260035830

ruwiki

namespace | pages total | pages with thumb
041608131109884
18207650
21543030
38409530
4746090
530540
62375120
710900
819280
92290
101882630
1198980
12740
13480
145222800
1573440
100227140
1017980
1021010
1031150
104208670
10557990
10632830
10712700
82819180
8291430

plwiki

namespace | pages total | pages with thumb
02036870892944
15215760
21454670
33431230
4932050
519820
613250
724870
87710
9590
10621930
1125850
124210
131050
142726140
1546100
10069520
1013930
102125540
10360120
82873550
829410
260044300

enwiktionary

namespace | pages total | pages with thumb
0708895678708
1783610
2365200
3576880
4118910
516420
6240
7350
810250
91400
10450250
1138200
121500
13660
146228870
1526480
9084710
9110
92120
9310
100233520
10117490
1022330
103140
106258140
1071490
10820
109340
11048760
1111510
114349910
11530
1162050
118295520
11926540
828557810
82910420

dewiktionary

namespace | pages total | pages with thumb
0104645926747
1157220
265670
3162970
418490
53140
61020
7310
88090
9570
1067110
114580
125830
13950
1486750
151080
10216200
1032500
106444290
1072710
108637020
1092180
110430
11110
8281250
829190

hewiktionary

namespace | pages total | pages with thumb
0286327041
130260
214480
365590
44400
5890
62860
740
85940
9330
1014940
111110
12280
1320
1427400
151090
100860
101280
828320

ptwiktionary

namespace | pages total | pages with thumb
027111926222
124300
225270
3646680
47440
51120
720
813750
9390
1082900
111220
121070
13190
14358790
151420
1002910
101240
1021010
10330
10410
106150
10849280
828570

ruwiktionary

namespace | pages total | pages with thumb
0202846239463
170590
2112880
3797470
453290
51960
61600
750
81840
9220
10372920
116670
12100
1330
142901420
157930
10010720
101820
102160
10310
10425480
105300
106350
10740
8285030
829120
230110
230310

plwiktionary

namespace | pages total | pages with thumb
0763265100948
131720
262710
347160
410020
5600
6460
710
85440
9150
1055350
111900
12150
1340
1465270
15560
1009400
101480
10224890
103710
1041070
105200
828770
82940

enwikiquote

namespace | pages total | pages with thumb
07299120511
1104060
2110370
3667370
4102030
53680
720
81080
9300
1016600
111270
121090
131000
14124430
151420
828930
82920

dewikiquote

namespace | pages total | pages with thumb
088912820
117940
220640
347470
43870
5680
6200
730
81900
9120
102480
11410
12190
1350
1416700
15300
10070
10140

hewikiquote

namespace | pages total | pages with thumb
083335216
18550
29020
315880
43240
5620
64670
790
83070
9260
1018050
11530
12480
1340
1411940
15450
10110
82830

ptwikiquote

namespace | pages total | pages with thumb
0121275214
13110
223170
3116800
44430
5140
6230
83110
9300
1012690
11130
12380
1424240
15120
828560

ruwikiquote

namespace | pages total | pages with thumb
0273625048
17820
214780
333670
43020
5200
710
81700
960
109630
11160
12220
1360
1451650
15250
828390

plwikiquote

namespace | pages total | pages with thumb
03359515460
112890
214370
391370
41580
5270
610
8810
920
106390
1180
12140
1459460
15250
82810

enwikinews

namespace | pages total | pages with thumb
04287016095
1235460
2136020
327138240
4117690
57170
660810
71930
84550
9910
1044630
114570
12740
1370
14114490
154240
9093200
92210
10042290
1011120
102123960
828290
82920

dewikinews

namespace | pages total | pages with thumb
0142045637
1124920
223380
341230
440680
52170
6650
770
82170
9240
1010100
111260
12420
13360
14173830
152720
1006570
1011160
10234980
82890

hewikinews

namespace | pages total | pages with thumb
01285411
11230
26090
35230
41160
5290
6490
710
81570
9190
104260
11280
12300
1310
1414520
15110
100210
10190
828210

ptwikinews

namespace | pages total | pages with thumb
02777521389
111600
217330
398940
486340
51200
660
740
85930
9110
1017220
11790
12970
13100
14119750
15320
1001280
101130
10410
8281150
82910

ruwikinews

namespace | pages total | pages with thumb
036957971489483
121600
215400
353970
467857100
5620
6250
720
81750
9150
10187020
11380
12370
1320
1419192540
151210
1003030
10160
10215058250
10310
8282970
829140

plwikinews

namespace | pages total | pages with thumb
03144112882
13540
213570
320810
42110
5150
612700
720
81440
960
1015720
11310
12280
1330
1484100
15150
1008660
101290
828270
82910

enwikiversity

namespace | pages total | pages with thumb
06022313548
1106870
2287600
3493770
419500
55440
6366330
71240
82220
9520
1089910
114530
122620
13660
1480480
151280
1005630
1012350
10247310
1033650
1045740
1057730
106500
10710
1189690
1191500
8284540
829280
230310

dewikiversity

namespace | pages total | pages with thumb
0462541773
15860
272010
3139040
48990
5970
630560
7530
81980
9200
1030670
11710
12260
1350
14167190
15230
106260750
1079150
10820660
1092980
828670

ptwikiversity

namespace | pages total | pages with thumb
075751375
16380
223770
359290
42920
5330
61150
740
8720
9150
107170
1140
12280
1370
148190
1560
828650

ruwikiversity

namespace | pages total | pages with thumb
06122559
15900
224210
367910
43130
5460
66590
8340
940
1023730
11140
12130
1320
1415040
1570
10028629
10160
102850
103370
828320

enwikivoyage

namespace | pages total | pages with thumb
06033018187
1141470
2419500
3340040
421420
58860
617620
73290
81820
9660
1016490
113970
12120
1320
1445800
15560
8281260
829190

dewikivoyage

namespace | pages total | pages with thumb
03029417462
125530
253110
3426680
413860
52170
67140
7380
82470
9210
1036940
114160
122440
13470
1462240
15230
1001350
101150
102400
10360
10619220
107260
82816610
829340

hewikivoyage

namespace | pages total | pages with thumb
046712336
128130
29890
311070
411310
51790
62290
720
81720
9570
1026400
11370
12860
1360
1411420
1510
1082320
1103480
828450

ptwikivoyage

namespace | pages total | pages with thumb
045271719
13310
24880
319540
43560
5420
610
8610
970
103540
11230
1220
1410430
828510

ruwikivoyage

namespace | pages total | pages with thumb
072855986
114660
230570
333780
43630
51010
6112620
7340
81280
9150
106340
11500
1210
144390
1530
828320
82920

plwikivoyage

namespace | pages total | pages with thumb
01306411262
1560
24080
32700
41370
5130
8480
930
102520
1150
1240
145990
828440

commonswiki

namespace | pages total | pages with thumb
0236023121759
129320
23600290
3109958920
411191890
5110760
6847523680
73311530
8213910
939400
102703000
1143560
1210110
13940
14113338402562221
15287920
100556470
1011550
102162800
1034840
104250
10520
10681730
107230
4607440
461300
486593720
487246990
4901600
82812780
8292200
11983735320
119960
2600190

wikidatawiki

namespace | pages total | pages with thumb
01008323444602017
1377370
2574550
3800780
4649840
5125850
740
835830
94030
1098470
112780
1225900
131670
1443540
15180
1201002718
121101300
1466676095861
1472760
6403410
641780
8287200
829840
11982190730
119930
2600244570

specieswiki

namespace | pages total | pages with thumb
01277830154732
1420130
257580
3139910
411910
5580
710
840110
91180
102611700
113560
123680
13170
14566450
15590
8282350
82970
1198226030
119910
2600580

If anyone is interested in numbers from another wiki, LMK and I can run this query again:

SELECT page_namespace AS namespace, COUNT(page_id) AS 'pages total', COUNT(DISTINCT pp_page) AS 'pages with thumb'
FROM page
LEFT JOIN page_props ON pp_page = page_id AND pp_propname IN ('page_image', 'page_image_free')
GROUP BY page_namespace;
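
To make the counting behaviour of that join concrete, here is a minimal sketch that runs the same shape of query against a toy in-memory SQLite database. The rows are made up and the schema is reduced to the columns the query touches. One deliberate difference, and an assumption on my part about the data: the total uses COUNT(DISTINCT page_id), because a page carrying both 'page_image' and 'page_image_free' joins twice and would be counted twice by a plain COUNT.

```python
import sqlite3

# Toy data only: a reduced schema with just the columns the query touches.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_namespace INTEGER);
CREATE TABLE page_props (pp_page INTEGER, pp_propname TEXT, pp_value TEXT);
INSERT INTO page VALUES (1, 0), (2, 0), (3, 0), (4, 1);
INSERT INTO page_props VALUES
    (1, 'page_image_free', 'A.jpg'),
    (2, 'page_image', 'B.jpg'),
    (2, 'page_image_free', 'C.jpg'),
    (3, 'displaytitle', 'Three');
""")

QUERY = """
SELECT page_namespace,
       COUNT(DISTINCT page_id) AS pages_total,
       COUNT(DISTINCT pp_page) AS pages_with_thumb
FROM page
LEFT JOIN page_props
       ON pp_page = page_id
      AND pp_propname IN ('page_image', 'page_image_free')
GROUP BY page_namespace
ORDER BY page_namespace
"""

rows = conn.execute(QUERY).fetchall()
for namespace, total, with_thumb in rows:
    # Coverage = share of pages in the namespace that have a page image.
    print(f"ns {namespace}: {with_thumb}/{total} with thumb ({with_thumb / total:.0%})")
```

Page 2 has both properties but is still counted once in each column, and page 3 has only an unrelated property, so it counts towards the total but not towards the thumbnail column.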

Questions/answers:

Again pinging @Sneha in case any of below is relevant.

  2. How can we present the image in the thumbnail?

Mostly a design question. We can probably stretch/fill any way we want, depending on design preferences.

  3. Is the thumbnail data already available in search results or do we need to pull it in via the api?

They're in Extension:PageImages. We can access them via API or get them from PageImages::getPageImage(). They live in an extension, though, so we'd probably need to use some Hooks mechanism to allow passing the data around, or make PageImages a hard dependency (which would exclude building this into core)
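
For the API route, a rough sketch of what a request and response handling could look like is below. It only builds the URL and parses a response; no network call is made. The parameters (prop=pageimages, piprop, pithumbsize) are the ones PageImages adds to the Action API, but the helper names, the sample response and the image URL are made up for illustration, and the response shape assumes formatversion=2.

```python
from urllib.parse import urlencode


def build_pageimages_url(api_base, titles, thumb_size=200):
    """Build an action=query URL asking PageImages for thumbnails of `titles`."""
    params = {
        "action": "query",
        "format": "json",
        "formatversion": "2",
        "prop": "pageimages",
        "piprop": "thumbnail",
        "pithumbsize": str(thumb_size),
        "titles": "|".join(titles),
    }
    return f"{api_base}?{urlencode(params)}"


def extract_thumbnails(response):
    """Map each page title to its thumbnail URL, or None if the page has no page image."""
    pages = response.get("query", {}).get("pages", [])
    return {p["title"]: p.get("thumbnail", {}).get("source") for p in pages}


# Hand-written sample in the assumed response shape (not real API output).
sample_response = {
    "query": {
        "pages": [
            {
                "pageid": 1,
                "ns": 0,
                "title": "Example A",
                "thumbnail": {
                    "source": "https://upload.example/thumb/200px-A.jpg",
                    "width": 200,
                    "height": 150,
                },
            },
            {"pageid": 2, "ns": 0, "title": "Example B"},
        ]
    }
}

url = build_pageimages_url("https://en.wikipedia.org/w/api.php", ["Example A", "Example B"])
thumbs = extract_thumbnails(sample_response)
```

Pages without a page image simply lack the thumbnail key, so the result rendering has to cope with None for those.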

  4. How many articles have images? And are we fine if the majority don't have any?

See previous comment.

  5. What do we want to do for non-content namespaces?

Those will usually not have thumbnails. PageImages doesn't even capture them outside the main namespace (we could change that if needed), but most of those pages don't have any imagery anyway (and for those that do, it likely often isn't relevant; e.g. the random imagery on User pages).

  6. What is the optimal resolution for likely-warm-in-cache to maximize image delivery?

The $wgThumbLimits sizes are the most likely to be pre-generated and warm in cache. Other sizes could be requested, but it may take some time to generate the images in those sizes (and for something like search, which may request many at once, this may cause too many requests where images fail to be served).
These widths are: 120, 150, 180, 200, 220, 250, 300, 400 (along with 1.5x & 2x responsive versions)
We can always downscale the thumbnail on the client, but upscaling will lose quality.
Basically, anything up to 400px would be fine.
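
As a small illustration of "request a standard width, then downscale on the client", here is a sketch that picks the smallest listed width that still covers the rendered size. The function name and the pixel-ratio handling are my own assumptions; the 1.5x and 2x variants mentioned above are not modelled separately.

```python
import math

# Widths listed above as most likely to be pre-generated.
STANDARD_WIDTHS = (120, 150, 180, 200, 220, 250, 300, 400)


def pick_thumb_width(display_px, pixel_ratio=1.0):
    """Return the smallest standard width covering display_px * pixel_ratio.

    Falls back to the largest standard width when nothing is big enough,
    which means the client would have to upscale (and lose some quality).
    """
    needed = math.ceil(display_px * pixel_ratio)
    for width in STANDARD_WIDTHS:
        if width >= needed:
            return width
    return STANDARD_WIDTHS[-1]


print(pick_thumb_width(100))        # 120
print(pick_thumb_width(160, 1.5))   # 250
print(pick_thumb_width(300, 2.0))   # 400
```

A 100px slot would request the 120px thumbnail and scale it down, while a 300px slot on a 2x display hits the 400px ceiling.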

Tech questions have been answered. Moving to "Needs Design" for design questions.

I have explored a few options, as shown in the image below. I am leaning more towards option 2, where we align the thumbnail to the top. The height of the thumbnail may vary, but the width will always remain the same and you always see the entire image. Option 4 is good too if we are able to do it in a way that doesn't crop off people's faces, but I know it is hard to predict where in the image the face is. Let me know your thoughts.

Thumbnails.png (312×623 px, 64 KB)

Also, regarding namespace results, I am thinking we should treat them differently: no thumbnails for them, and thumbnails only for articles (even if there is no image, use an empty-state thumbnail for articles).
Having namespace results and articles look different is good because I am thinking we should reserve quick view for articles only, so that visual difference will help users understand the behaviour.

@matthiasmullie @Seddon

I think the Go bar autocomplete thumbnails are currently set to option 3 in the image above. But it would not look good in a bigger size if things were cropped inappropriately.

New Vector does #3, a simple centered fill, indeed.
#4 would also be easy to do - we still can't predict what'll fall off, but cropped faces would become a less likely occurrence (though other important stuff might be in the lower parts)

I didn't have to look too hard to find examples of images where the height or width is at least 2x the size of the other measurement. E.g. wide landscape or high portrait.

Filling out a square would lose a lot of detail in these, but having wildly inconsistent thumb dimensions may look awful as well.

Personally, I think I prefer #3 & #4 over #1 & #2 for this specific purpose: losing some cropped details isn't too important IMO (it'd still help you understand what the page is about), but I'm afraid that the visual inconsistency of having a variety of formats & their positioning might be rather distracting (or not, IDK, haven't actually seen it in action, assumptions may be all wrong :p)
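
To make the #3 vs #4 trade-off concrete: as I read the discussion, #3 is a centered square fill and #4 is the same fill anchored to the top of the image. The sketch below computes the crop window for both, plus how much of the image survives a square crop. Function names are made up; the box is (left, upper, right, lower) in source pixels.

```python
def square_crop_box(width, height, align="center"):
    """Return the (left, upper, right, lower) square crop window.

    align="center" keeps the middle of the image (option #3 as I understand it);
    align="top" keeps the top of tall images (option #4). Horizontal cropping of
    wide images is centered in both cases.
    """
    side = min(width, height)
    left = (width - side) // 2
    if align == "top":
        upper = 0
    elif align == "center":
        upper = (height - side) // 2
    else:
        raise ValueError(f"unknown align: {align!r}")
    return (left, upper, left + side, upper + side)


def visible_fraction(width, height):
    """Share of the image area that remains after a square crop."""
    return min(width, height) / max(width, height)


# A tall portrait, twice as high as it is wide.
print(square_crop_box(400, 800, "center"))  # (0, 200, 400, 600)
print(square_crop_box(400, 800, "top"))     # (0, 0, 400, 400)
print(visible_fraction(400, 800))           # 0.5
```

For a 2:1 image either variant throws away half of the picture; the only difference is which half. In the browser the same effect would more likely come from CSS (object-fit: cover with object-position set to center or top) than from computing boxes by hand.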

Hmm, I saw your examples and yeah, some of those very wide and very tall examples would not work so well for 1 and 2. If we are able to do #4 easily (which I assumed was not easy to do, since we don't do it elsewhere), then we can go with that. We will have the full image in the quick view, so I agree a bit of cropping may look better than some very wide/tall images fitted in a box. I am guessing that if we show the top of vertical images, we may be able to avoid cropped faces most of the time.

So here is an example of how cropped thumbnails will look. Note that only the first image is showing the top of the image; the rest are not, but if we go with option 4 they all would.

Special_Search.png (1×1 px, 286 KB)

And an example of option 1 (2 would look very similar):

Special_Search.png (1×1 px, 254 KB)