Software zum Installieren eines Smart-Mirror Frameworks , zum Nutzen von hochschulrelevanten Informationen, auf einem Raspberry-Pi.
You can not select more than 25 topics Topics must start with a letter or number, can include dashes ('-') and can be up to 35 characters long.

README.md 15KB

123456789101112131415161718192021222324252627282930313233343536373839404142434445464748495051525354555657585960616263646566676869707172737475767778798081828384858687888990919293949596979899100101102103104105106107108109110111112113114115116117118119120121122123124125126127128129130131132133134135136137138139140141142143144145146147148149150151152153154155156157158159160161162163164165166167168169170171172173174175176177178179180181182183184185186187188189190191192193194195196197198199200201202203204205206207208209210211212213214215216217218219220221222223224225226227228229230231232233234235236237238239240241242243244245246247248249250251252253254255256257258259260261262263264265266267268269270271272273274275276277278279280281282283284285286287288289290291292293294295296297298299300301302303304305306307308309310311312313314315316317318319320321322323324325326327328329330331332333334335336337338339340341342343344345346347348349350351352353354355356357358359360361362363364365366367368369370371372373374375376377378379380381382383384385386387388389390391392393394395396397398399400401402403404
  1. # diff-sequences
  2. Compare items in two sequences to find a **longest common subsequence**.
  3. The items not in common are the items to delete or insert in a **shortest edit script**.
  4. To maximize flexibility and minimize memory, you write **callback** functions as configuration:
  5. **Input** function `isCommon(aIndex, bIndex)` compares items at indexes in the sequences and returns a truthy/falsey value. This package might call your function more than once for some pairs of indexes.
  6. - Because your function encapsulates **comparison**, this package can compare items according to `===` operator, `Object.is` method, or other criterion.
  7. - Because your function encapsulates **sequences**, this package can find differences in arrays, strings, or other data.
  8. **Output** function `foundSubsequence(nCommon, aCommon, bCommon)` receives the number of adjacent items and starting indexes of each common subsequence. If sequences do not have common items, then this package does not call your function.
  9. If N is the sum of lengths of sequences and L is length of a longest common subsequence, then D = N – 2L is the number of **differences** in the corresponding shortest edit script.
  10. [_An O(ND) Difference Algorithm and Its Variations_](http://xmailserver.org/diff2.pdf) by Eugene W. Myers is fast when sequences have **few** differences.
  11. This package implements the **linear space** variation with optimizations so it is fast even when sequences have **many** differences.
  12. ## Usage
  13. To add this package as a dependency of a project, do either of the following:
  14. - `npm install diff-sequences`
  15. - `yarn add diff-sequences`
  16. To use `diff` as the name of the default export from this package, do either of the following:
  17. - `var diff = require('diff-sequences').default; // CommonJS modules`
  18. - `import diff from 'diff-sequences'; // ECMAScript modules`
  19. Call `diff` with the **lengths** of sequences and your **callback** functions:
  20. ```js
  21. const a = ['a', 'b', 'c', 'a', 'b', 'b', 'a'];
  22. const b = ['c', 'b', 'a', 'b', 'a', 'c'];
  23. function isCommon(aIndex, bIndex) {
  24. return a[aIndex] === b[bIndex];
  25. }
  26. function foundSubsequence(nCommon, aCommon, bCommon) {
  27. // see examples
  28. }
  29. diff(a.length, b.length, isCommon, foundSubsequence);
  30. ```
  31. ## Example of longest common subsequence
  32. Some sequences (for example, `a` and `b` in the example of usage) have more than one longest common subsequence.
  33. This package finds the following common items:
  34. | comparisons of common items | values | output arguments |
  35. | :------------------------------- | :--------- | --------------------------: |
  36. | `a[2] === b[0]` | `'c'` | `foundSubsequence(1, 2, 0)` |
  37. | `a[4] === b[1]` | `'b'` | `foundSubsequence(1, 4, 1)` |
  38. | `a[5] === b[3] && a[6] === b[4]` | `'b', 'a'` | `foundSubsequence(2, 5, 3)` |
  39. The “edit graph” analogy in the Myers paper shows the following common items:
  40. | comparisons of common items | values |
  41. | :------------------------------- | :--------- |
  42. | `a[2] === b[0]` | `'c'` |
  43. | `a[3] === b[2] && a[4] === b[3]` | `'a', 'b'` |
  44. | `a[6] === b[4]` | `'a'` |
  45. Various packages which implement the Myers algorithm will **always agree** on the **length** of a longest common subsequence, but might **sometimes disagree** on which **items** are in it.
  46. ## Example of callback functions to count common items
  47. ```js
  48. // Return length of longest common subsequence according to === operator.
  49. function countCommonItems(a, b) {
  50. let n = 0;
  51. function isCommon(aIndex, bIndex) {
  52. return a[aIndex] === b[bIndex];
  53. }
  54. function foundSubsequence(nCommon) {
  55. n += nCommon;
  56. }
  57. diff(a.length, b.length, isCommon, foundSubsequence);
  58. return n;
  59. }
  60. const commonLength = countCommonItems(
  61. ['a', 'b', 'c', 'a', 'b', 'b', 'a'],
  62. ['c', 'b', 'a', 'b', 'a', 'c'],
  63. );
  64. ```
  65. | category of items | expression | value |
  66. | :----------------- | ------------------------: | ----: |
  67. | in common | `commonLength` | `4` |
  68. | to delete from `a` | `a.length - commonLength` | `3` |
  69. | to insert from `b` | `b.length - commonLength` | `2` |
  70. If the length difference `b.length - a.length` is:
  71. - negative: its absolute value is the minimum number of items to **delete** from `a`
  72. - positive: it is the minimum number of items to **insert** from `b`
  73. - zero: there is an **equal** number of items to delete from `a` and insert from `b`
  74. - non-zero: there is an equal number of **additional** items to delete from `a` and insert from `b`
  75. In this example, `6 - 7` is:
  76. - negative: `1` is the minimum number of items to **delete** from `a`
  77. - non-zero: `2` is the number of **additional** items to delete from `a` and insert from `b`
  78. ## Example of callback functions to find common items
  79. ```js
  80. // Return array of items in longest common subsequence according to Object.is method.
  81. const findCommonItems = (a, b) => {
  82. const array = [];
  83. diff(
  84. a.length,
  85. b.length,
  86. (aIndex, bIndex) => Object.is(a[aIndex], b[bIndex]),
  87. (nCommon, aCommon) => {
  88. for (; nCommon !== 0; nCommon -= 1, aCommon += 1) {
  89. array.push(a[aCommon]);
  90. }
  91. },
  92. );
  93. return array;
  94. };
  95. const commonItems = findCommonItems(
  96. ['a', 'b', 'c', 'a', 'b', 'b', 'a'],
  97. ['c', 'b', 'a', 'b', 'a', 'c'],
  98. );
  99. ```
  100. | `i` | `commonItems[i]` | `aIndex` |
  101. | --: | :--------------- | -------: |
  102. | `0` | `'c'` | `2` |
  103. | `1` | `'b'` | `4` |
  104. | `2` | `'b'` | `5` |
  105. | `3` | `'a'` | `6` |
  106. ## Example of callback functions to diff index intervals
  107. Instead of slicing array-like objects, you can adjust indexes in your callback functions.
  108. ```js
  109. // Diff index intervals that are half open [start, end) like array slice method.
  110. const diffIndexIntervals = (a, aStart, aEnd, b, bStart, bEnd) => {
  111. // Validate: 0 <= aStart and aStart <= aEnd and aEnd <= a.length
  112. // Validate: 0 <= bStart and bStart <= bEnd and bEnd <= b.length
  113. diff(
  114. aEnd - aStart,
  115. bEnd - bStart,
  116. (aIndex, bIndex) => Object.is(a[aStart + aIndex], b[bStart + bIndex]),
  117. (nCommon, aCommon, bCommon) => {
  118. // aStart + aCommon, bStart + bCommon
  119. },
  120. );
  121. // After the last common subsequence, do any remaining work.
  122. };
  123. ```
  124. ## Example of callback functions to emulate diff command
  125. Linux or Unix has a `diff` command to compare files line by line. Its output is a **shortest edit script**:
  126. - **c**hange adjacent lines from the first file to lines from the second file
  127. - **d**elete lines from the first file
  128. - **a**ppend or insert lines from the second file
  129. ```js
  130. // Given zero-based half-open range [start, end) of array indexes,
  131. // return one-based closed range [start + 1, end] as string.
  132. const getRange = (start, end) =>
  133. start + 1 === end ? `${start + 1}` : `${start + 1},${end}`;
  134. // Given index intervals of lines to delete or insert, or both, or neither,
  135. // push formatted diff lines onto array.
  136. const pushDelIns = (aLines, aIndex, aEnd, bLines, bIndex, bEnd, array) => {
  137. const deleteLines = aIndex !== aEnd;
  138. const insertLines = bIndex !== bEnd;
  139. const changeLines = deleteLines && insertLines;
  140. if (changeLines) {
  141. array.push(getRange(aIndex, aEnd) + 'c' + getRange(bIndex, bEnd));
  142. } else if (deleteLines) {
  143. array.push(getRange(aIndex, aEnd) + 'd' + String(bIndex));
  144. } else if (insertLines) {
  145. array.push(String(aIndex) + 'a' + getRange(bIndex, bEnd));
  146. } else {
  147. return;
  148. }
  149. for (; aIndex !== aEnd; aIndex += 1) {
  150. array.push('< ' + aLines[aIndex]); // delete is less than
  151. }
  152. if (changeLines) {
  153. array.push('---');
  154. }
  155. for (; bIndex !== bEnd; bIndex += 1) {
  156. array.push('> ' + bLines[bIndex]); // insert is greater than
  157. }
  158. };
  159. // Given content of two files, return emulated output of diff utility.
  160. const findShortestEditScript = (a, b) => {
  161. const aLines = a.split('\n');
  162. const bLines = b.split('\n');
  163. const aLength = aLines.length;
  164. const bLength = bLines.length;
  165. const isCommon = (aIndex, bIndex) => aLines[aIndex] === bLines[bIndex];
  166. let aIndex = 0;
  167. let bIndex = 0;
  168. const array = [];
  169. const foundSubsequence = (nCommon, aCommon, bCommon) => {
  170. pushDelIns(aLines, aIndex, aCommon, bLines, bIndex, bCommon, array);
  171. aIndex = aCommon + nCommon; // number of lines compared in a
  172. bIndex = bCommon + nCommon; // number of lines compared in b
  173. };
  174. diff(aLength, bLength, isCommon, foundSubsequence);
  175. // After the last common subsequence, push remaining change lines.
  176. pushDelIns(aLines, aIndex, aLength, bLines, bIndex, bLength, array);
  177. return array.length === 0 ? '' : array.join('\n') + '\n';
  178. };
  179. ```
  180. ## Example of callback functions to format diff lines
  181. Here is simplified code to format **changed and unchanged lines** in expected and received values after a test fails in Jest:
  182. ```js
  183. // Format diff with minus or plus for change lines and space for common lines.
  184. const formatDiffLines = (a, b) => {
  185. // Jest depends on pretty-format package to serialize objects as strings.
  186. // Unindented for comparison to avoid distracting differences:
  187. const aLinesUn = format(a, {indent: 0 /*, other options*/}).split('\n');
  188. const bLinesUn = format(b, {indent: 0 /*, other options*/}).split('\n');
  189. // Indented to display changed and unchanged lines:
  190. const aLinesIn = format(a, {indent: 2 /*, other options*/}).split('\n');
  191. const bLinesIn = format(b, {indent: 2 /*, other options*/}).split('\n');
  192. const aLength = aLinesIn.length; // Validate: aLinesUn.length === aLength
  193. const bLength = bLinesIn.length; // Validate: bLinesUn.length === bLength
  194. const isCommon = (aIndex, bIndex) => aLinesUn[aIndex] === bLinesUn[bIndex];
  195. // Only because the GitHub Flavored Markdown doc collapses adjacent spaces,
  196. // this example code and the following table represent spaces as middle dots.
  197. let aIndex = 0;
  198. let bIndex = 0;
  199. const array = [];
  200. const foundSubsequence = (nCommon, aCommon, bCommon) => {
  201. for (; aIndex !== aCommon; aIndex += 1) {
  202. array.push('-·' + aLinesIn[aIndex]); // delete is minus
  203. }
  204. for (; bIndex !== bCommon; bIndex += 1) {
  205. array.push('+·' + bLinesIn[bIndex]); // insert is plus
  206. }
  207. for (; nCommon !== 0; nCommon -= 1, aIndex += 1, bIndex += 1) {
  208. // For common lines, received indentation seems more intuitive.
  209. array.push('··' + bLinesIn[bIndex]); // common is space
  210. }
  211. };
  212. diff(aLength, bLength, isCommon, foundSubsequence);
  213. // After the last common subsequence, push remaining change lines.
  214. for (; aIndex !== aLength; aIndex += 1) {
  215. array.push('-·' + aLinesIn[aIndex]);
  216. }
  217. for (; bIndex !== bLength; bIndex += 1) {
  218. array.push('+·' + bLinesIn[bIndex]);
  219. }
  220. return array;
  221. };
  222. const expected = {
  223. searching: '',
  224. sorting: {
  225. ascending: true,
  226. fieldKey: 'what',
  227. },
  228. };
  229. const received = {
  230. searching: '',
  231. sorting: [
  232. {
  233. descending: false,
  234. fieldKey: 'what',
  235. },
  236. ],
  237. };
  238. const diffLines = formatDiffLines(expected, received);
  239. ```
  240. If N is the sum of lengths of sequences and L is length of a longest common subsequence, then N – L is length of an array of diff lines. In this example, N is 7 + 9, L is 5, and N – L is 11.
  241. | `i` | `diffLines[i]` | `aIndex` | `bIndex` |
  242. | ---: | :--------------------------------- | -------: | -------: |
  243. | `0` | `'··Object {'` | `0` | `0` |
  244. | `1` | `'····"searching": "",'` | `1` | `1` |
  245. | `2` | `'-···"sorting": Object {'` | `2` | |
  246. | `3` | `'-·····"ascending": true,'` | `3` | |
  247. | `4` | `'+·····"sorting": Array ['` | | `2` |
  248. | `5` | `'+·······Object {'` | | `3` |
  249. | `6` | `'+·········"descending": false,'` | | `4` |
  250. | `7` | `'··········"fieldKey": "what",'` | `4` | `5` |
  251. | `8` | `'········},'` | `5` | `6` |
  252. | `9` | `'+·····],'` | | `7` |
  253. | `10` | `'··}'` | `6` | `8` |
  254. ## Example of callback functions to find diff items
  255. Here is simplified code to find changed and unchanged substrings **within adjacent changed lines** in expected and received values after a test fails in Jest:
  256. ```js
  257. // Return diff items for strings (compatible with diff-match-patch package).
  258. const findDiffItems = (a, b) => {
  259. const isCommon = (aIndex, bIndex) => a[aIndex] === b[bIndex];
  260. let aIndex = 0;
  261. let bIndex = 0;
  262. const array = [];
  263. const foundSubsequence = (nCommon, aCommon, bCommon) => {
  264. if (aIndex !== aCommon) {
  265. array.push([-1, a.slice(aIndex, aCommon)]); // delete is -1
  266. }
  267. if (bIndex !== bCommon) {
  268. array.push([1, b.slice(bIndex, bCommon)]); // insert is 1
  269. }
  270. aIndex = aCommon + nCommon; // number of characters compared in a
  271. bIndex = bCommon + nCommon; // number of characters compared in b
  272. array.push([0, a.slice(aCommon, aIndex)]); // common is 0
  273. };
  274. diff(a.length, b.length, isCommon, foundSubsequence);
  275. // After the last common subsequence, push remaining change items.
  276. if (aIndex !== a.length) {
  277. array.push([-1, a.slice(aIndex)]);
  278. }
  279. if (bIndex !== b.length) {
  280. array.push([1, b.slice(bIndex)]);
  281. }
  282. return array;
  283. };
  284. const expectedDeleted = ['"sorting": Object {', '"ascending": true,'].join(
  285. '\n',
  286. );
  287. const receivedInserted = [
  288. '"sorting": Array [',
  289. 'Object {',
  290. '"descending": false,',
  291. ].join('\n');
  292. const diffItems = findDiffItems(expectedDeleted, receivedInserted);
  293. ```
  294. | `i` | `diffItems[i][0]` | `diffItems[i][1]` |
  295. | --: | ----------------: | :---------------- |
  296. | `0` | `0` | `'"sorting": '` |
  297. | `1` | `1` | `'Array [\n'` |
  298. | `2` | `0` | `'Object {\n"'` |
  299. | `3` | `-1` | `'a'` |
  300. | `4` | `1` | `'de'` |
  301. | `5` | `0` | `'scending": '` |
  302. | `6` | `-1` | `'tru'` |
  303. | `7` | `1` | `'fals'` |
  304. | `8` | `0` | `'e,'` |
  305. The length difference `b.length - a.length` is equal to the sum of `diffItems[i][0]` values times `diffItems[i][1]` lengths. In this example, the difference `48 - 38` is equal to the sum `10`.
  306. | category of diff item | `[0]` | `[1]` lengths | subtotal |
  307. | :-------------------- | ----: | -----------------: | -------: |
  308. | in common | `0` | `11 + 10 + 11 + 2` | `0` |
  309. | to delete from `a` | `–1` | `1 + 3` | `-4` |
  310. | to insert from `b` | `1` | `8 + 2 + 4` | `14` |
  311. Instead of formatting the changed substrings with escape codes for colors in the `foundSubsequence` function to save memory, this example spends memory to **gain flexibility** before formatting, so a separate heuristic algorithm might modify the generic array of diff items to show changes more clearly:
  312. | `i` | `diffItems[i][0]` | `diffItems[i][1]` |
  313. | --: | ----------------: | :---------------- |
  314. | `6` | `-1` | `'true'` |
  315. | `7` | `1` | `'false'` |
  316. | `8` | `0` | `','` |
  317. For expected and received strings of serialized data, the result of finding changed **lines**, and then finding changed **substrings** within adjacent changed lines (as in the preceding two examples) sometimes displays the changes in a more intuitive way than the result of finding changed substrings, and then splitting them into changed and unchanged lines.