Commit 1b7442fc authored and committed by Florian Münchbach

[syntax_highlight] Integrate with master branch

This commit contains all changes starting with the first integration of
the plugin into the gajim_1.1 branch (76dabe2e) until the current plugin
version V3 (42b9aeb8).
The manifest.ini is updated for compatibility with upcoming Gajim
versions.
parent 85648899
# Syntax Highlighting Plugin for Gajim
A [Gajim](https://gajim.org) plugin that highlights source code blocks in the chatbox.
## Installation
The recommended way of installing this plugin is to use
Gajim's Plugin Installer.
For more information and instructions on how to install plugins manually, please
refer to the [Gajim Plugin Wiki page](https://dev.gajim.org/gajim/gajim-plugins/wikis/home#how-to-install-plugins).
## Usage
This plugin uses markdown-style syntax to identify which parts of a message
should be formatted as code in the chatbox.
```
Inline source code will be highlighted when placed in between `two single
back-ticks`.
```
Inline code is highlighted using the default language selected in the plugin
settings.
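As a rough illustration of how inline spans can be detected, the sketch below
applies the regular expression used by this plugin's `find_inline_matches`
method. The helper name `find_inline_code` and the sample messages are made up
for this example; it is not the plugin's complete processing logic.

```python
import re

# Pattern taken from the plugin's find_inline_matches(): a span must be
# preceded by the start of the text, whitespace or one of the XEP-0393 span
# markers (*, ~, _) and followed by whitespace, one of those markers or the
# end of the text.
INLINE_CODE = re.compile(r'(?:^|\s|\*|~|_)(`((?!`).+?)`)(?:\s|\*|~|_|$)')


def find_inline_code(message):
    """Return the code inside each inline span (hypothetical helper)."""
    return [match.group(2) for match in INLINE_CODE.finditer(message)]


# A span surrounded by whitespace is found ...
print(find_inline_code("Call `len(items)` to count them"))
# ... while back-ticks glued to other words are ignored.
print(find_inline_code("no`code`here"))
```

Back-ticks that directly touch other characters are left alone, which keeps
ordinary text containing stray back-ticks from being treated as code.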
Multi-line code blocks are started by three back-ticks followed by a newline.
Optionally, a language can be specified directly after the opening back-ticks and
before the line break:
````
```language
Note that the last line of a code block may only contain the closing back-ticks,
i.e. there must be a newline here.
```
````
If no language is specified with the opening tag, or the specified language
cannot be identified, the default language configured in the settings is
used.
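The fallback can be sketched roughly as follows. This is a minimal
illustration, not the plugin's code: `pick_language`, `KNOWN_LANGUAGES` and the
default value are hypothetical stand-ins for the lexer lookup and the default
language configured in the settings.

```python
import re

# Matches the opening marker: three back-ticks, an optional language name,
# then the line break that starts the code itself.
OPENING_MARKER = re.compile(r'^```([^\n]*)\n')

# Hypothetical stand-in for the set of languages a highlighter knows.
KNOWN_LANGUAGES = {"python", "c", "bash"}


def pick_language(block, default="python"):
    """Return the language named in the opening marker, or the default."""
    match = OPENING_MARKER.match(block)
    language = match.group(1).strip().lower() if match else ""
    return language if language in KNOWN_LANGUAGES else default


print(pick_language("```c\nint x = 0;\n```"))    # language given
print(pick_language("```\nx = 0\n```"))          # no language given
print(pick_language("```klingon\nx = 0\n```"))   # unknown language
```

Both the missing and the unknown language end up with the default, which
mirrors the behaviour described above.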
You can test it by copying and sending the following text to one of your
contacts:
````
```python
def test():
    print("Hello, world!")
```
````
(**Note:** your contact will not see highlighted text unless they are also
using the plugin.)
## Relation to XEP-0393 - 'Message Styling'
https://xmpp.org/extensions/xep-0393.html#pre-block
In [XEP-0393](https://xmpp.org/extensions/xep-0393.html),
the back-tick based syntax is defined as markup for preformatted
text blocks and for inline preformatted text.
The XEP recommends formatting such text with a monospaced font.
Because this plugin uses the same syntax as XEP-0393, XMPP clients that support
XEP-0393 but not syntax highlighting can at least present these blocks to their
users as preformatted text.
Since text in between the back-tick markers is not further formatted by this
plugin, it can be considered "pre-formatted".
Hence, this plugin is compatible with the formatting options defined by XEP-0393,
[section 5.1.2, "Preformatted Text"](https://xmpp.org/extensions/xep-0393.html#pre-block)
and [section 5.2.5, "Preformatted Span"](https://xmpp.org/extensions/xep-0393.html#mono).
Nevertheless, syntax highlighting for source code is not part of the XEP but
rather a non-standard extension introduced with this plugin.
## Configuration
The configuration can be found via 'Gajim' > 'Plugins': select the
'Source Code Syntax Highlight' plugin and click the gears symbol.
The configuration options let you specify many details of how code is formatted,
including the default language, style, font settings, background color and
formatting of the code markers.
In the configuration window, the current settings are displayed in an
interactive preview panel. This lets you check directly how code will look in
the message window.
## Report Bugs and Feature Requests
Please report bugs to the [Gajim Plugin Issue tracker](https://dev.gajim.org/gajim/gajim-plugins/issues/new?issue[FlorianMuenchbach]=&issue[description]=Gajim%20Version%3A%20%0APlugin%20Version%3A%0AOperating%20System%3A&issue[title]=[syntax_highlight]).
Please make sure that the issue you create contains `[syntax_highlight]` in the
title and includes information such as the Gajim version, plugin version and
operating system.
## Debug
The plugin adds its own logger. It can be used to set a specific debug level
for this plugin and/or filter log messages.
Run
```
gajim --loglevel gajim.plugin_system.syntax_highlight=DEBUG
```
in a terminal to display the debug messages.
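The same logger can also be adjusted from Python, for example in a small test
script. The sketch below relies only on the logger name shown in the command
above and on the standard `logging` module; `enable_plugin_debug` is a
hypothetical helper and not part of Gajim or this plugin.

```python
import logging

# Logger name as used with the --loglevel option above.
PLUGIN_LOGGER = 'gajim.plugin_system.syntax_highlight'


def enable_plugin_debug():
    """Raise only this plugin's logger to DEBUG (hypothetical helper)."""
    plugin_log = logging.getLogger(PLUGIN_LOGGER)
    plugin_log.setLevel(logging.DEBUG)
    return plugin_log


# Print the logger name with each record so messages are easy to filter.
logging.basicConfig(format='%(name)s %(levelname)s: %(message)s')
log = enable_plugin_debug()
log.debug("debug output is now enabled for %s", log.name)
```

Other loggers keep their own levels, so only the plugin's messages become more
verbose.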
## Known Issues / ToDo
* ~~Gajim crashes when correcting a message containing highlighted code.~~
(fixed in version 1.1.0)
## Credits
Since I had no experience in writing plugins for Gajim, I used the
[Latex Plugin](https://trac-plugins.gajim.org/wiki/LatexPlugin)
written by Yves Fischer and Yann Leboulanger as an example and copied a large
portion of the initial code from it. Therefore, credits go to the authors of the
Latex Plugin for providing an example.
The syntax highlighting itself is done by [pygments](http://pygments.org/).
from .syntax_highlight import SyntaxHighlighterPlugin
import logging
import re

import pygments
import pygments.styles
import pygments.token
from gi.repository import Gtk

from .gtkformatter import GTKFormatter
from .types import MatchType, LineBreakOptions, CodeMarkerOptions

log = logging.getLogger('gajim.plugin_system.syntax_highlight')


class ChatSyntaxHighlighter:
    def hide_code_markup(self, buf, start, end):
        """Hide the text between start and end with a shared invisible tag."""
        tag = buf.get_tag_table().lookup('hide_code_markup')
        if tag is None:
            tag = Gtk.TextTag.new('hide_code_markup')
            tag.set_property('invisible', True)
            buf.get_tag_table().add(tag)
        buf.apply_tag_by_name('hide_code_markup', start, end)

    def check_line_break(self, is_multiline):
        line_break = self.config.get_line_break_action()
        return (line_break == LineBreakOptions.ALWAYS) \
            or (is_multiline and line_break == LineBreakOptions.MULTILINE)

    def format_code(self, buf, s_tag, s_code, e_tag, e_code, language):
        """Format the code markers and highlight the code between them."""
        style = self.config.get_style_name()
        if self.config.get_code_marker_setting() == CodeMarkerOptions.HIDE:
            self.hide_code_markup(buf, s_tag, s_code)
            self.hide_code_markup(buf, e_code, e_tag)
        else:
            # Render the markers like a code comment in the selected style.
            comment_tag = GTKFormatter.create_tag_for_token(
                pygments.token.Comment,
                pygments.styles.get_style_by_name(style))
            buf.get_tag_table().add(comment_tag)
            buf.apply_tag(comment_tag, s_tag, s_code)
            buf.apply_tag(comment_tag, e_tag, e_code)

        code = s_code.get_text(e_code)
        log.debug("full text to encode: %s.", code)

        start_mark = buf.create_mark(None, s_code, False)

        lexer = None

        if language is None:
            lexer = self.config.get_default_lexer()
            log.info("No Language specified. Falling back to default lexer: %s.",
                     self.config.get_default_lexer_name())
        else:
            log.debug("Using lexer for %s.", str(language))
            lexer = self.config.get_lexer_with_fallback(language)

        if lexer is None:
            iterator = buf.get_iter_at_mark(start_mark)
            buf.insert(iterator, '\n')
        elif not self.config.is_internal_none_lexer(lexer):
            tokens = pygments.lex(code, lexer)

            formatter = GTKFormatter(style=style, start_mark=start_mark)
            pygments.format(tokens, formatter, buf)

    def find_multiline_matches(self, text):
        """Return (start, end, text) tuples for all multi-line code blocks."""
        start = None
        matches = []
        # Less strict alternative that allows prefixed whitespaces:
        # re.finditer(r'(?:^|\n)[ |\t]*(```)\S*[ |\t]*(?:\n|$)', text, re.DOTALL)
        for i in re.finditer(r'(?:^|\n)(```)\S*(?:\n|$)', text, re.DOTALL):
            if start is None:
                start = i
            elif re.match(r'^\n```', i.group(0)) is not None:
                matches.append(
                    (start.start(), i.end(), text[start.start():i.end()]))
                start = None
            else:
                # not an end...
                continue
        return matches

    def find_inline_matches(self, text):
        """
        Inline code is highlighted if the start marker is preceded by a start
        of line, a whitespace character or either of the other span markers
        defined in XEP-0393.
        The same applies mirrored to the end marker.
        """
        return [(i.start(1), i.end(1), i.group(1)) for i in
                re.finditer(r'(?:^|\s|\*|~|_)(`((?!`).+?)`)(?:\s|\*|~|_|$)', text)]

    def merge_match_groups(self, real_text, inline_matches, multiline_matches):
        """Split real_text into ordered (text, MatchType) parts."""
        it_inline = iter(inline_matches)
        it_multi = iter(multiline_matches)
        length = len(real_text)

        # Just to get cleaner code below...
        def get_next(iterator):
            return next(iterator, (length, length, ""))

        # In order to simplify the process, we use the 'length' here.
        cur_inline = get_next(it_inline)
        cur_multi = get_next(it_multi)

        pos = 0

        # This will contain tuples with parts of the input and its classification
        parts = []

        while pos < length:
            log.debug("-> in: %s", str(cur_inline))
            log.debug("-> mu: %s", str(cur_multi))

            # selected = (start, end, type)
            if cur_inline[0] < cur_multi[0]:
                selected = (cur_inline[0], cur_inline[1], MatchType.INLINE)
            elif cur_multi[0] < length:
                selected = (cur_multi[0], cur_multi[1], MatchType.MULTILINE)
            else:
                selected = (pos, length, MatchType.TEXT)
            log.debug("--> select: %s", str(selected))

            # Handle plain text string parts (and unforeseen errors...)
            if pos < selected[0]:
                parts.append((real_text[pos:selected[0]], MatchType.TEXT))
                pos = selected[0]
            elif pos > selected[0]:
                log.error("Should not happen, position > found match.")

            # Cut out and append selected text segment
            parts.append((real_text[selected[0]:selected[1]], selected[2]))
            pos = selected[1]

            # Depending on the match type, we have to forward the iterators.
            # Also, forward the other one, if regions overlap or we took over...
            if selected[2] == MatchType.INLINE:
                if cur_multi[0] < cur_inline[1]:
                    cur_multi = get_next(it_multi)
                cur_inline = get_next(it_inline)
            elif selected[2] == MatchType.MULTILINE:
                if cur_inline[0] < cur_multi[1]:
                    cur_inline = get_next(it_inline)
                cur_multi = get_next(it_multi)

        return parts

    def process_text(self, real_text, other_tags, _graphics, iter_,
                     _additional):
        """Insert real_text into the chat buffer and highlight any code."""
        def fix_newline(char, marker_len_no_newline, force=False):
            # Returns (marker width including newline, newline to insert).
            fixed = (marker_len_no_newline, '')
            if char == '\n':
                fixed = (marker_len_no_newline + 1, '')
            elif force:
                fixed = (marker_len_no_newline + 1, '\n')
            return fixed

        buf = self.textview.tv.get_buffer()

        # first, try to find inline or multiline code snippets
        inline_matches = self.find_inline_matches(real_text)
        multiline_matches = self.find_multiline_matches(real_text)

        if not inline_matches and not multiline_matches:
            log.debug("Stopping early, since there is no code block in it....")
            return

        iterator = iter_ if iter_ is not None else buf.get_end_iter()

        # Create a start marker with left gravity before inserting text.
        start_mark = buf.create_mark("SHP_start", iterator, True)
        end_mark = buf.create_mark("SHP_end", iterator, False)

        insert_newline_for_multiline = self.check_line_break(True)
        insert_newline_for_inline = self.check_line_break(False)

        split_text = self.merge_match_groups(
            real_text, inline_matches, multiline_matches)

        buf.begin_user_action()

        for num, (text_to_insert, match_type) in enumerate(split_text):
            language = None
            end_of_message = num == (len(split_text) - 1)

            if match_type == MatchType.TEXT:
                self.textview.detect_and_print_special_text(
                    text_to_insert, other_tags, graphics=_graphics,
                    iter_=iterator, additional_data=_additional)
            else:
                if match_type == MatchType.MULTILINE:
                    language_match = re.search(
                        '\n*```([^\n]*)\n', text_to_insert, re.DOTALL)
                    language = None if language_match is None \
                        else language_match.group(1)
                    language_len = 0 if language is None else len(language)

                    # Account for the width of the language name in the
                    # front marker.
                    front = fix_newline(
                        text_to_insert[0], 3 + language_len,
                        insert_newline_for_multiline)
                    back = fix_newline(
                        text_to_insert[-1], 3,
                        insert_newline_for_multiline and not end_of_message)
                else:
                    front = fix_newline(
                        text_to_insert[0], 1,
                        insert_newline_for_inline)
                    back = fix_newline(
                        text_to_insert[-1], 1,
                        insert_newline_for_inline and not end_of_message)

                marker_widths = (front[0], back[0])
                text_to_insert = ''.join([front[1], text_to_insert, back[1]])

                # insertion invalidates iterator, let's use our start mark...
                self.insert_and_format_code(
                    buf, text_to_insert, language,
                    marker_widths, start_mark, end_mark, other_tags)

            iterator = buf.get_iter_at_mark(end_mark)
            # the current end of the buffer's contents is the start for the
            # next iteration
            buf.move_mark(start_mark, iterator)

        buf.delete_mark(start_mark)
        buf.delete_mark(end_mark)

        buf.end_user_action()

        # We have to make sure this is the last thing we do (i.e. no more
        # calls to the other textview methods from here on), because the
        # print_special_text method is resetting the plugin_modified variable...
        self.textview.plugin_modified = True

    def insert_and_format_code(self, buf, insert_text, language, marker,
                               start_mark, end_mark, other_tags=None):
        """Insert a code snippet between the marks and apply its formatting."""
        start_iter = buf.get_iter_at_mark(start_mark)

        if other_tags:
            buf.insert_with_tags_by_name(start_iter, insert_text,
                                         *other_tags)
        else:
            buf.insert(start_iter, insert_text)

        # The insertion invalidated the iterators, so fetch them again.
        tag_start = buf.get_iter_at_mark(start_mark)
        tag_end = buf.get_iter_at_mark(end_mark)
        s_code = tag_start.copy()
        e_code = tag_end.copy()
        s_code.forward_chars(marker[0])
        e_code.backward_chars(marker[1])

        log.debug("full text between tags: %s.", tag_start.get_text(tag_end))

        self.format_code(buf, tag_start, s_code, tag_end, e_code, language)

        self.textview.plugin_modified = True

        # Set general code block format
        tag = Gtk.TextTag.new()
        if self.config.is_bgcolor_override_enabled():
            tag.set_property('background', self.config.get_bgcolor())
            tag.set_property('paragraph-background', self.config.get_bgcolor())
        tag.set_property('font', self.config.get_font())
        buf.get_tag_table().add(tag)
        buf.apply_tag(tag, tag_start, tag_end)

    def __init__(self, config, textview):
        self.last_end_mark = None
        self.config = config
        self.textview = textview
<?xml version="1.0" encoding="UTF-8"?>
<!-- Generated with glade 3.22.1 -->
<interface>
<requires lib="gtk+" version="3.0"/>
<object class="GtkTextBuffer"/>
<object class="GtkListStore" id="code_marker_selection">
<columns>
<!-- column-name column1 -->
<column type="gchararray"/>
</columns>
<data>
<row>
<col id="0" translatable="yes">Treat as code comment</col>
</row>
<row>
<col id="0" translatable="yes">Hide code markers</col>
</row>
</data>
</object>
<object class="GtkListStore" id="line_break_selection">
<columns>
<!-- column-name Text -->
<column type="gchararray"/>
</columns>
<data>
<row>
<col id="0" translatable="yes">Never</col>
</row>
<row>
<col id="0" translatable="yes">Always</col>
</row>
<row>
<col id="0" translatable="yes">Only around multi-line code blocks</col>
</row>
</data>
</object>
<object class="GtkTextBuffer" id="preview_textbuffer">
<property name="text" translatable="yes">// Test your highlighting here
# Test your highlighting here
/* Test your highlighting here */
% Test your highlighting here
; Test your highlighting here
&lt;!-- Test your highlighting here --&gt;</property>
</object>
<object class="GtkWindow" id="window1">
<property name="can_focus">False</property>
<child>
<placeholder/>
</child>
<child>
<object class="GtkBox" id="mainBox">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="orientation">vertical</property>
<child>
<object class="GtkGrid">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="margin_bottom">40</property>
<child>
<object class="GtkLabel" id="label1">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="label" translatable="yes">Default Language for Syntax Highlighting</property>
<property name="xalign">0</property>
</object>
<packing>
<property name="left_attach">0</property>
<property name="top_attach">0</property>
</packing>
</child>
<child>
<object class="GtkLabel" id="label2">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="label" translatable="yes">Insert Line breaks around Code Blocks</property>
<property name="xalign">0</property>
</object>
<packing>
<property name="left_attach">0</property>
<property name="top_attach">1</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="default_lexer_combobox">
<property name="visible">True</property>
<property name="can_focus">False</property>
<signal name="changed" handler="lexer_changed" swapped="no"/>
<child>
<object class="GtkCellRendererText" id="cellrenderertext1"/>
<attributes>
<attribute name="text">0</attribute>
</attributes>
</child>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">0</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="line_break_combobox">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="model">line_break_selection</property>
<signal name="changed" handler="line_break_changed" swapped="no"/>
<child>
<object class="GtkCellRendererText" id="cellrenderertext2"/>
<attributes>
<attribute name="text">0</attribute>
</attributes>
</child>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">1</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="style_combobox">
<property name="visible">True</property>
<property name="can_focus">False</property>
<signal name="changed" handler="style_changed" swapped="no"/>
<child>
<object class="GtkCellRendererText" id="cellrenderertext3"/>
<attributes>
<attribute name="text">0</attribute>
</attributes>
</child>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">2</property>
</packing>
</child>
<child>
<object class="GtkLabel" id="label3">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="label" translatable="yes">Select Syntax Highlighting Style</property>
<property name="xalign">0</property>
</object>
<packing>
<property name="left_attach">0</property>
<property name="top_attach">2</property>
</packing>
</child>
<child>
<object class="GtkLabel">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="label" translatable="yes">Select code marker (the backticks) formatting:</property>
<property name="xalign">0</property>
</object>
<packing>
<property name="left_attach">0</property>
<property name="top_attach">3</property>
</packing>
</child>
<child>
<object class="GtkComboBox" id="code_marker_combobox">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="model">code_marker_selection</property>
<signal name="changed" handler="code_marker_changed" swapped="no"/>
<child>
<object class="GtkCellRendererText" id="cellrenderertext4"/>
<attributes>
<attribute name="text">0</attribute>
</attributes>
</child>
</object>
<packing>
<property name="left_attach">1</property>
<property name="top_attach">3</property>
</packing>
</child>
<child>
<object class="GtkLabel">
<property name="visible">True</property>
<property name="can_focus">False</property>
<property name="label" translatable="yes">Select Font for Code Snippets:</property>
<property name="xalign">0</property>
</object>
<packing>
<property name="left_attach">0</property>
<property name="top_attach">4</property>
</packing>
</child>
<child>
<object class="GtkFontButton" id="font_button">
<property name="visible">True</property>
<property name="can_focus">True</property>
<property name="receives_default">True</property>
<property name="font">Sans 12</property>
<property name="language">de-de</property>
<property name="preview_text"/>
<property name="use_font">True</property>
<signal name="font-set" handler="font_changed" swapped="no"/>
</object>
<packing>
<property name="left_attach">1</property>