#indiewebcamp 2013-11-15

2013-11-15 UTC
_6a68 and tantek joined the channel
#
bret
jekyll does future posts, but it's a simple date sort, nothing more
#
bret
you can prevent future posts from rendering too
#
bret
not sure if that's wiki worthy though
_6a68 joined the channel
#
bret
also, in flat file applications, are you literally reading a post file every time someone requests to look at it?
#
bret
or do you process those files into ram or something?
#
aaronpk
p3k processes the flat file and caches it in RAM. not the whole HTML page, just the HTML after building the page's content. (header/sidebar/footer stuff is not cached along with the page)
tantek joined the channel
#
tantek
flakey network + log caching = am I speaking to myself or not
#
tantek
bret - I think it's wiki worthy
#
tantek
so does that mean that jekyll renders future posts by default? instead of not showing them before their time by default?
#
tantek
that seems wrong
#
tantek
bret - in answer to your question, Falcon is literally reading a post file every time someone requests to look at it. Assuming "typical" OS file caching that keeps commonly read files in memory, this is not a problem. See http://indiewebcamp.com/Falcon#Storage_format for details
#
tantek
*posts* file - for that bim
npdoty, scor and ryana joined the channel
#
bret
tantek: sorry had to move locations for a bit
#
tantek
I did too earlier
#
bret
jekyll just renders static pages. It can render future posts, or it can ignore them. default is ignore
#
bret
plus personal sites tend to be fairly low traffic, at least mine is
#
bret
reading #storage_format now
ozten, ryana, thatryana, _6a68 and josephboyle joined the channel
#
tantek
hmm - now I have to think about how to write good time code in CASSIS (date stuff is taken care of pretty well)
bnvk, caseorganic and tantek joined the channel
#
tantek
and this is where I consider regretting not storing my datetimes in UTC
#
pdurbin
tantek: having issues?
#
tantek
well I store(d) my datetimes in the more easily verifiable *human* local time
#
tantek
but basically stuck to Pacific Time since no matter where I travel I keep track of that personally
#
tantek
so now I just need to write code to convert that (reliably) to UTC which will be - interesting (given DST etc.)
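A rough illustration of the conversion being discussed, in Python rather than CASSIS. It assumes datetimes are stored as "YYYY-MM-DD HH:MM" strings in Pacific Time (the storage format here is a guess) and that the system has IANA time zone data available for `zoneinfo`. The `fold` argument shows where DST makes this "interesting": on the autumn changeover one local hour occurs twice, so the stored local time alone does not say which UTC instant was meant.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+, needs tz data on the system

PACIFIC = ZoneInfo("America/Los_Angeles")


def pacific_to_utc(local_str, fold=0):
    """Convert a naive 'YYYY-MM-DD HH:MM' Pacific Time string to an aware UTC datetime.

    fold=0 picks the first occurrence of an ambiguous local time (still on PDT),
    fold=1 picks the second (already on PST).
    """
    naive = datetime.strptime(local_str, "%Y-%m-%d %H:%M")
    aware = naive.replace(tzinfo=PACIFIC, fold=fold)
    return aware.astimezone(timezone.utc)


# Winter (PST, UTC-8) and summer (PDT, UTC-7) give different offsets.
winter = pacific_to_utc("2013-11-15 12:00")
summer = pacific_to_utc("2013-07-01 12:00")

# 01:30 on 2013-11-03 happened twice in Pacific Time.
first = pacific_to_utc("2013-11-03 01:30", fold=0)
second = pacific_to_utc("2013-11-03 01:30", fold=1)
```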
#
tantek
it's an interesting design challenge. but I think the better option is to keep the data as easily human verifiable as possible.
#
tantek
which means not UTC
#
pdurbin
not epoch seconds? ;)
#
tantek
nah - harder to debug
#
tantek
writing code for easier debugging, even if it means more code
#
tantek
knowing that more time is spent debugging in general
tpinto, scor, squeakytoy, skinny, snarfed, tantek, b0bg0d, bnvk, squeakytoy2 and josephboyle joined the channel
#
KevinMarks_
ooh thanks tantek, I need those functions for noterlive.
#
KevinMarks_
!tell tantek thanks for the twitter length functions. Noterlive needs them
#
Loqi
Ok, I'll tell him that when I see him next
brianloveswords, tpinto, npdoty, cweiske, vrypan, bnvk, b0bg0d, andreypopp, poppy, friedcell, icco, Jihaisse and pfefferle joined the channel
#
Loqi
pfefferle: tantek left you a message 10 hours, 22 minutes ago: do you know Eric? https://twitter.com/ericnakagawa/status/370604732689027072 perhaps he can help with the wordpress conundrums discussed earlier (re: core vs themes etc.)
tantek joined the channel
#
Loqi
tantek: KevinMarks_ left you a message 2 hours, 22 minutes ago: thanks for the twitter length functions. Noterlive needs them
LauraJ joined the channel
#
tantek
KevinMarks_ - glad you can use them! :)
#
tantek
hello
tpinto joined the channel
#
pfefferle
tantek no, doesn't know eric, is he following this chat?
#
tantek
pfefferle not that I know of - he just asked on Twitter
#
pfefferle
tantek ok... will ask him to join the wiki, he can perhaps help me writing a page about theming
#
tantek
that would be great!
bnvk joined the channel
#
tantek
and I just got posting to the future working :)
#
tantek
I can write posts, post-date their published date, and they don't show up on my home page until their datetime passes.
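A small sketch of that behaviour, assuming each post is a dict with an aware `published` datetime (standing in for its dt-published value). The function name `visible_posts` and the dict shape are hypothetical; this is not Falcon's code.

```python
from datetime import datetime, timezone


def visible_posts(posts, now=None):
    """Return posts whose published time has passed, newest first."""
    if now is None:
        now = datetime.now(timezone.utc)
    due = [p for p in posts if p["published"] <= now]
    return sorted(due, key=lambda p: p["published"], reverse=True)
```

Because the check runs at request time, a post-dated entry appears on the home page as soon as its datetime passes, with no separate publish step.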
#
tantek
cc: aaronpk, bret
LauraJ joined the channel
#
Jihaisse
tommorris: did that work?
#
tantek
tommorris :)
#
tommorris
Jihaisse: have to wait a few days to find out. Postal system and all.
#
tantek
and with that I'm off to bed
#
tommorris
tantek: cool. will look shortly
#
tantek
it's just a description of it. nothing fancy
#
tantek
just got the code flow to properly handle dt-published values that are in the future
#
tantek
it's also good foundational code for "future" posts like events and travel plans
tpinto joined the channel
#
KevinMarks_
tantek did you make a future post?
tantek, friedcell, bnvk and b0bg0d1 joined the channel
#
KevinMarks_
my liking for making things in node goes down with each deployment wait on heroku. Thinking that this isomorphic javascript thing isn't worth the setup
bnvk, cweiske, skinny, melvster, barnabywalters, pfenwick, tpinto, friedcell, vrypan and andreypopp joined the channel
#
pdurbin
KevinMarks_: back to PHP? ;)
pfefferle_, bnvk, pfefferle, skinny, andreypopp, squeakytoy, LauraJ, scor and brianloveswords joined the channel
#
@SwissHealthBank
MT @SwissGator: Good morning! Day 2 at #DHC13 - looking fwd to hearing more innov. ways to improve #healthcare #SwissHealthbank #ownyourdata
(twitter.com/_/status/401348179817684992)
andreypopp and indiewebcamp-vis joined the channel
#
barnabywalters
greetings indiewebcamp-vis! use /nick yourname to set your username
andreypopp_, bnvk and barnabywalters_ joined the channel
#
@STCMilwaukee
RT @cassie917: Successfully facilitated over 350 data chats w. scholars at Carver today! #ownyourdata
(twitter.com/_/status/401365702944497664)
andreypopp, skinny, snarfed, bnvk, tantek and ozten joined the channel
#
@KoleFCKnueppel
RT @cassie917: Successfully facilitated over 350 data chats w. scholars at Carver today! #ownyourdata
(twitter.com/_/status/401385181581299712)
skinny, scor, ryana, thatryana, snarfed, caseorganic, fmarier, squeakytoy, tantek, bnvk_, bnvk__, LauraJ, bnvk, andreypopp, paulcp, smus, KevinMarks2, warden, Slyphoria, KevinMarks_, XgF and barnabywalters joined the channel
#
barnabywalters
aaronpk: do you strip the protocol off URLs when saving into your mention archive?
#
aaronpk
I strip off the protocol when building the path on disk to save the parsed data, but I still store the full URL of mentions
#
tantek
that makes sense
#
barnabywalters
okay — and when an update comes, do you overwrite the original (presumably the history is stored in git)?
#
tantek
is this something to capture as part of indiearchive?
#
aaronpk
(to barnaby)
#
barnabywalters
I was just looking at http://indiewebcamp.com/IndieArchive and realised that urls with EPD/SSS/https://domain.com aren’t filesystem-friendly
bnvk joined the channel
#
barnabywalters
due to the need for a folder with an empty name
#
tantek
EPD/SSS?
#
barnabywalters
aaronpk: sounds good. For my use cases I want to be able to easily list and jump between different copies of archived URLs, so I’m making the path into the folder, and then storing EPDSSS.html under that
#
aaronpk
don't double slashes in URLs and filesystem just get collapsed into a single slash automatically?
#
barnabywalters
tantek: epoch days/seconds in that day
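For illustration, "epoch days / seconds in that day" can be computed from a Unix timestamp with a single divmod. The function name is hypothetical, and any zero-padding or exact formatting used on the IndieArchive page is not assumed here; the sketch only returns the two integers.

```python
from datetime import datetime, timezone

SECONDS_PER_DAY = 86400


def epoch_days_and_seconds(moment):
    """Split an aware datetime into (days since 1970-01-01 UTC, seconds into that day)."""
    timestamp = int(moment.timestamp())
    return divmod(timestamp, SECONDS_PER_DAY)


# One day and one second after the epoch.
example = epoch_days_and_seconds(datetime(1970, 1, 2, 0, 0, 1, tzinfo=timezone.utc))
```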
#
barnabywalters
aaronpk: if they do, mkdir() in php doesn’t know about it
#
aaronpk
must be a browser and unix tool thing then, not a filesystem thing
#
barnabywalters
this way I can always easily get the latest version of a mention, but list previous versions
#
tantek
barnabywalters do you mean paths?
#
tantek
or are you talking about archiving archives?
#
tantek
I'm a little unclear on what scenario you're having problems with.
#
barnabywalters
tantek: referring to the double-slash thing, or the listing-previous-versions thing?
#
tantek
this, what does this mean? (i.e. explain the use case that causes this problem) "urls with EPD/SSS/https://domain.com aren’t filesystem-friendly"
#
aaronpk
I think he's talking about storing the archives as files on disk with the same folder structure as the URL
#
barnabywalters
I didn’t know that double slashes were collapsed into single ones, so that case could be dealt with by merely replacing https?:// with https?:/
#
barnabywalters
if apache/x other server handles slash collapsing
#
aaronpk
barnabywalters: I don't know what exactly is doing that, but if you type cd /var//www you'll get to the right place
#
tantek
ah ok!
#
tantek
btw - TimBL has since said that had he known better, he would have omitted the "//"
#
tantek
so we can just do that
#
tantek
after the scheme
#
barnabywalters
the colon could be left out unless there’s an explicit port, too
#
barnabywalters
no, forget that
#
tantek
no it can't since "httpsaaronparecki.com" is a valid domain
#
barnabywalters
I’m getting my colons mixed up
#
aaronpk
is "https:" a valid folder name?
#
barnabywalters
colons should be fine, except on weird OSes which use : as the dir separator
#
barnabywalters
yep, they work
#
tantek
aaronpk - there's the general problem of URL-safe characters being maybe a superset of filename path-safe characters
#
XgF
aaronpk: Per POSIX, double slash can be collapsed everywhere *except* at the start of a path. // and / may point at different locations
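Python's standard `posixpath.normpath` happens to follow the same rule, so it makes a quick way to see it: repeated slashes inside a path collapse, exactly two leading slashes are preserved, and three or more leading slashes collapse to one. This only shows path normalisation; it says nothing about what PHP's mkdir() does.

```python
import posixpath

# Repeated slashes in the middle of a path collapse to one.
inner = posixpath.normpath("/var//www")

# Exactly two leading slashes are kept, since POSIX lets "//" mean something else.
leading_two = posixpath.normpath("//var//www")

# Three or more leading slashes are treated as a single slash.
leading_three = posixpath.normpath("///var//www")
```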
#
tantek
so if you're putting a URL in your filesystem, you need to handle that general problem
#
tantek
rather than case by case dealing with ":" etc.
#
barnabywalters
and then what happens when the URL has a bunch of .. in
#
aaronpk
XgF: cool. wonder why PHP doesn't know about it?
#
XgF
aaronpk, barnabywalters: https: is a valid path everywhere but Windows?
#
XgF
aaronpk: Dunno. Because PHP is an epic pile of fail? :-)
#
aaronpk
(he says, to a room of PHP programmers)
josephboyle joined the channel
#
tantek
XgF - hence why those of us that use it use a deliberate subset :)
#
XgF
aaronpk: Hey, I use PHP occasionally too. It's practical, maybe pragmatic, and also disgusting :-)
#
XgF
For the curious, POSIX allows // and / to differ because I think Apollo used // to indicate "other machines" - think similar to Windows UNC paths
#
tantek.com
edited /IndieArchive (+110) "/* URL design */ drop // from archive path format, per TimBL suggestion"
#
tantek
is rocking the citation-fu today.
#
barnabywalters
“The double slash, though a programming convention at the time” — was // a “programming convention”?
#
aaronpk
comments?
#
barnabywalters
that’s still a convention now
#
XgF
barnabywalters: Maybe taken from Apollo (see my above comment on POSIX)?
#
XgF
Hmm, does anyone know if NeXTSTEP did something similar? TBL developed HTTP/HTML on a NeXT cube
#
barnabywalters
ooh good point
#
tantek
and then we could have had protocol relative URLs by starting them with ":" rather than "//"
#
barnabywalters
aaronpk: do you do anything special to handle query string params when translating URL -> filesystem path?
#
aaronpk
not yet
#
tantek
better escape those "." :)
#
tantek
or maybe just drop webmention sources with ".." in the URL
tpinto joined the channel
#
tantek
since we're talking about auto-ignoring spam anyway
#
tantek
might as well auto-ignore such sketchiness as well
#
barnabywalters
yeah, there’s probably no good reason to support mentions from people with .. in the URL
#
XgF
Though still, you probably want an escape sequence for query strings and such
#
tantek
have yet to see a real world use case
#
XgF
Maybe "__", with "__" -> "___", "?" -> "__q" or similar?
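A sketch of an escape scheme in that spirit. It is a variant, not the mapping as proposed: read literally, "__" -> "___" plus "?" -> "__q" seems to map both "_?" and "__q" to "___q", so this version escapes every single "_" instead, which keeps decoding unambiguous. The extra codes for "&" and "=" and the function names are assumptions for illustration.

```python
# "_" is the escape character; every code is "_" plus one letter (or "_" itself).
ESCAPES = {"_": "__", "?": "_q", "&": "_a", "=": "_e"}
UNESCAPES = {code[1]: char for char, code in ESCAPES.items()}


def escape_component(text):
    """Replace filesystem-awkward URL characters with two-character escape codes."""
    return "".join(ESCAPES.get(ch, ch) for ch in text)


def unescape_component(text):
    """Reverse escape_component; raises ValueError on a malformed escape."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "_":
            if i + 1 >= len(text) or text[i + 1] not in UNESCAPES:
                raise ValueError("malformed escape at position %d" % i)
            out.append(UNESCAPES[text[i + 1]])
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

With this mapping a WordPress-style permalink fragment such as `?p=1` becomes `_qp_e1`, and the original string can always be recovered.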
#
tantek
I'd say start with blanket dropping such URLs
#
tantek
with ".."
#
aaronpk
yeah so far nobody publishes pages with query strings
#
tantek
and wait for someone to complain with a real world use case
#
tantek
until then, tough.
#
aaronpk
s/pages/post pages/
#
Loqi
aaronpk meant to say: yeah so far nobody publishes post pages with query strings
#
barnabywalters
okay, so dropping query string and ignoring any .. in the URL for the moment
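Putting the rules agreed so far into one hedged sketch: strip the scheme, drop the query string, refuse any URL containing "..", and collapse empty path segments so no folder with an empty name is needed. The function name `archive_path` and the choice to return `None` for rejected URLs are assumptions; real code would also have to deal with other characters that are URL-safe but not filename-safe.

```python
from urllib.parse import urlsplit


def archive_path(url):
    """Map a mention source URL to a relative filesystem path, or None if rejected."""
    if ".." in url:
        return None  # blanket-drop anything that could climb out of the archive
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return None
    # Query string and fragment are ignored; empty segments ("//") are collapsed.
    segments = [seg for seg in parts.path.split("/") if seg]
    return "/".join([parts.netloc] + segments)
```

For example, `archive_path("https://aaronparecki.com/notes/1?utm=x")` gives `aaronparecki.com/notes/1`, while a URL containing `/../` is refused outright.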
#
tantek
no one says you have to accept webmentions from any type of URL
#
tantek
query strings make for bad permalinks anyway
#
tantek
that could be another webmention filter
#
tantek
use well designed URLs or your webmention gets ignored :P
#
tantek
could even have an error code for that
#
aaronpk
yeah, they're called "query" strings after all... not perma-strings
#
Loqi
aaronpk has 293 karma
#
tantek
461 Source URL is fugly
#
aaronpk
hah! an opinionated HTTP server?
#
XgF
410 Gone (your URL scared my webmention endpoint)
#
barnabywalters
462 Volcano Fax Number: Source is using schema.org
warden joined the channel
#
tantek
that's too unforgiving. the webmention sender should be given a chance to clean up their source URL and try again. 410 implies don't webmention me ever again.
#
tantek
(410 that is, is too unforgiving)
#
tantek
barnabywalters LOL
#
tantek.com
created /webhosting (+25) "r"
#
tantek
barnabywalters - actually - there's 451 which is similar - Unavailable For Legal Reasons
#
barnabywalters
wow, didn’t know that one existed
#
tantek
as in Fahrenheit 451
#
tantek
but what we want to communicate is not that the requested URL is unavailable for legal reasons, but rather that we can't parse your source URL for legal reasons
#
tantek
so perhaps
#
barnabywalters
that one looks much less interesting
#
tantek
so for something like webmention where the server is expected to go retrieve a URL provided by the client
#
tantek
452 Source URL unparsable for legal reasons - e.g. schemaorg patent policy does not give patent permissions to parse schema (only to publish)
#
tantek
wow Google is filtering search results for schema within my website. I smell bad code or conspiracy
#
barnabywalters
tantek: or filter bubbling? I see plenty of results for site:tantek.com schema
#
tantek
barnabywalters - try the search box on tantek.com which includes extra parameters for strict date-time ordering of results
#
tantek
none of my schema-related osfw3c posts show up
#
tantek
I have to browse/nav to them explicitly by date
paulcp joined the channel
#
barnabywalters
huh, creepy
#
tantek
so yeah, return a 452 if someone sends a URL to HTML with schema markup, per: http://tantek.com/2013/219/t20/osfw3c-schema-org-bad-for-open-web
#
tantek
3 months and they haven't updated their TOS like they claimed they would - so I don't believe/trust anyone at Google about schema any more.
#
KevinMarks_
post pages with query strings are quite common in blogland
#
barnabywalters
KevinMarks2: referring to all the wordpress sites which use /?p=1
#
KevinMarks_
and other PHP based things
#
barnabywalters
does wordpress automatically include a better canonical URL even if the page is accessed using that URL, or do you need a plugin for that?
#
KevinMarks_
I fought this with rel=tag, fairly successfully (though now it's used as reason for not using rel=tag)
#
KevinMarks_
I think WP does better canonical URLs now, by default
#
KevinMarks_
they used to use tag links with ? params too, which we policed out as an antipattern with technorati indexing
#
snarfed
barnabywalters: once it's published publicly, yes, and automatically redirects to it
andreypopp, mko, tantek, pfefferle and bnvk joined the channel
#
@ShaneHudson
@adactio It really is starting to look like #IndieWeb is the *only* way forward. Otherwise the web is not going to be a nice place.
(twitter.com/_/status/401488487058112512)
josephboyle, ryana, paulcp, caseorganic, caseorga_, tantek and abrereton joined the channel