Empty table cells aren't serialized correctly
Closed, Resolved (Public)

Description

% echo "<table><tr><td>x</td><td align="center" data-parsoid='{&quot;stx&quot;:&quot;row&quot;}'></td><td data-parsoid='{&quot;stx&quot;:&quot;row&quot;}'>y</td></tr></table>" | parse.js --html2wt
{|
|x|| align="center" |||y
|}

The empty second td should have been serialized with a <nowiki/> as its content, so that the output looked like |x|| align="center" |<nowiki/>||y

The incorrect wikitext that Parsoid's html2wt emits parses back as follows, with the attribute text turned into cell content:

<table><tr>
<td>x</td>
<td>align="center"</td>
<td>y</td></tr></table>
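
To illustrate the ambiguity, here is a minimal Python sketch of a simplified cell splitter. It is not Parsoid or MediaWiki parser code: `parse_row_cells` is a hypothetical helper, and it assumes that a row line is first split on `||` and that each piece is then split on its first `|` into attributes and content. Under that assumption, the `|||` run is consumed as a `||` separator followed by a stray `|`, so the attribute text has no `|` left to mark it as attributes.

```python
def parse_row_cells(line):
    """Split one '|'-prefixed table row line into (attrs, content) pairs.

    Simplified model (an assumption, not the real parser): cells are
    separated by '||', and within a cell the first '|' separates the
    attributes from the content.
    """
    if not line.startswith("|"):
        raise ValueError("expected a table cell line starting with '|'")
    cells = []
    for raw in line[1:].split("||"):
        if "|" in raw:
            attrs, content = raw.split("|", 1)
        else:
            # No '|' in the piece: everything is treated as content.
            attrs, content = "", raw
        cells.append((attrs.strip(), content.strip()))
    return cells


# The wikitext from the description: the attributes end up as cell content.
broken = parse_row_cells('|x|| align="center" |||y')

# With explicit content in the empty cell, the attributes are kept as attributes.
with_nowiki = parse_row_cells('|x|| align="center" |<nowiki/>||y')
```

In this model `broken` has `align="center"` as the content of the second cell, which matches the misparsed HTML above, while `with_nowiki` keeps it as the attribute part.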

Event Timeline

ssastry triaged this task as Medium priority. (May 14 2018, 2:20 PM)
ssastry renamed this task from Misisng required nowiki in table syntax during html2wt to Missing required nowiki in table syntax during html2wt. (May 14 2018, 9:54 PM)

Oh hmm .. bad test case.

[subbu@earth:~/work/wmf/mediawiki] echo "<table><tr><td>x</td><td data-parsoid='{&quot;stx&quot;:&quot;row&quot;}'></td><td data-parsoid='{&quot;stx&quot;:&quot;row&quot;}'>y</td></tr></table>" | parse.js --html2wt | parse.js --normalize=parsoid

<table>
<tbody>
<tr>
<td>x</td>
<td></td>
<td>y</td>
</tr>
</tbody>
</table>

[subbu@earth:~/work/wmf/mediawiki] echo "<table><tr><td>x</td><td data-parsoid='{&quot;stx&quot;:&quot;row&quot;}'></td><td data-parsoid='{&quot;stx&quot;:&quot;row&quot;}'>y</td></tr></table>" | parse.js --html2wt | php maintenance/parse.php 
parse.php: warning: reading wikitext from STDIN. Press CTRL+D to parse.

<div class="mw-parser-output"><table>
<tr>
<td>x</td>
<td></td>
<td>y
</td></tr></table>
</div>

Oh hmm .. bad test case.

I got the parse ambiguity flipped around: it requires a td with attributes and empty content. Fixed the test case.

Change 433087 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] WIP: Serialize empty <td>s as <nowiki/> in certain scenarios

https://gerrit.wikimedia.org/r/433087

Change 433087 abandoned by Subramanya Sastry:
Serialize empty <td>s as <nowiki/> in certain scenarios

Reason:
I was not thinking straight. Instead of all this nowiki crap, I should simply emit a space there.

https://gerrit.wikimedia.org/r/433087

Change 433103 had a related patch set uploaded (by Subramanya Sastry; owner: Subramanya Sastry):
[mediawiki/services/parsoid@master] Empty td and th cells are serialized with a single whitespace char

https://gerrit.wikimedia.org/r/433103

ssastry renamed this task from Missing required nowiki in table syntax during html2wt to Empty table cells aren't serialized correctly. (May 15 2018, 2:13 PM)

I decided to eliminate the nowikis and instead emit a whitespace char in all empty cells, which is a much better approach.
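
A rough Python sketch of that approach, for illustration only. This is not the code from the merged patch: `serialize_row` is a hypothetical helper that assumes every cell uses the `||` row syntax and is given as an `(attrs, content)` pair.

```python
def serialize_row(cells):
    """Serialize (attrs, content) pairs as a single-line wikitext table row.

    Empty cells get one space as content, so a cell with attributes
    never produces the ambiguous '|||' sequence.
    """
    parts = []
    for attrs, content in cells:
        # Assumed rule: any empty cell is emitted as a single space.
        body = content if content else " "
        if attrs:
            parts.append(f" {attrs} |{body}")
        else:
            parts.append(body)
    return "|" + "||".join(parts)


row = serialize_row([("", "x"), ('align="center"', ""), ("", "y")])
print(row)
```

For the cells in the task description this prints `|x|| align="center" | ||y`, where the space between `|` and `||` keeps the attribute separator distinct from the cell separator.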

Change 433103 merged by jenkins-bot:
[mediawiki/services/parsoid@master] Empty td and th cells are serialized with a single whitespace char

https://gerrit.wikimedia.org/r/433103

Vvjjkkii renamed this task from Empty table cells aren't serialized correctly to 5zcaaaaaaa. (Jul 1 2018, 1:10 AM)
Vvjjkkii reopened this task as Open.
Vvjjkkii removed ssastry as the assignee of this task.
Vvjjkkii raised the priority of this task from Medium to High.
Vvjjkkii updated the task description.
Vvjjkkii removed subscribers: gerritbot, Aklapper.
CommunityTechBot renamed this task from 5zcaaaaaaa to Empty table cells aren't serialized correctly. (Jul 2 2018, 5:31 AM)
CommunityTechBot closed this task as Resolved.
CommunityTechBot assigned this task to ssastry.
CommunityTechBot lowered the priority of this task from High to Medium.
CommunityTechBot updated the task description.
CommunityTechBot added subscribers: gerritbot, Aklapper.